
NEW SEQUENCES OF HEPATITIS C VIRUS GENOTYPES AND THEIR USE AS 
THERAPEUTIC AND DIAGNOSTIC AGENTS 

The invention relates to new sequences of hepatitis C virus (HCV) genotypes and their use 
as therapeutic and diagnostic agents.

The present invention relates to new nucleotide and amino acid sequences corresponding
to the coding region of a new type 2 subtype 2d, type-specific sequences corresponding to
HCV type 3a, to new sequences corresponding to the coding region of a new subtype 3c, and
to new sequences corresponding to the coding region of HCV type 4 and type 5 subtype 5a;
a process for preparing them, and their use for diagnosis, prophylaxis and therapy.

The technical problem underlying the present invention is to provide new type-specific
sequences of the Core, the E1, the E2, the NS3, the NS4 and the NS5 regions of HCV type
4 and type 5, as well as of new variants of HCV types 2 and 3. These new HCV sequences
are useful to diagnose the presence of type 2 and/or type 3 and/or type 4 and/or type 5 HCV
genotypes in a biological sample. Moreover, the availability of these new type-specific
sequences can increase the overall sensitivity of HCV detection and should also prove to be
useful for therapeutic purposes.

Hepatitis C viruses (HCV) have been found to be the major cause of non-A, non-B 
hepatitis. The sequences of cDNA clones covering the complete genome of several prototype 
isolates have been determined (Kato et al., 1990; Choo et al., 1991; Okamoto et al., 1991;
Okamoto et al., 1992). Comparison of these isolates shows that the variability in nucleotide 
sequences can be used to distinguish at least 2 different genotypes, type 1 (HCV-1 and HCV- 
J) and type 2 (HC-J6 and HC-J8), with an average homology of about 68%. Within each 
type, at least two subtypes exist (e.g. represented by HCV-1 and HCV- J), having an average 
homology of about 79%. HCV genomes belonging to the same subtype show average 
homologies of more than 90% (Okamoto et al., 1992). However, the partial nucleotide 
sequence of the NS5 region of the HCV-T isolates showed at most 67% homology with the 
previously published sequences, indicating the existence of yet another HCV type (Mori et
al., 1992). Parts of the 5' untranslated region (UR), core, NS3, and NS5 regions of this type 
3 have been published, further establishing the similar evolutionary distances between the 3 
major genotypes and their subtypes (Chan et al., 1992). 

The identification of type 3 genotypes in clinical samples can be achieved by means of 
PCR with type-specific primers for the NS5 region. However, the degree to which this will 
be successful is largely dependent on sequence variability and on the virus titer in the
serum. Therefore, routine PCR in the open reading frame, especially for type 3 and the new
type 4 and 5 described in the present invention and/or group V (Cha et al., 1992) genotypes
can be predicted to be unsuccessful. A new typing system (LiPA), based on variation in the
highly conserved 5' UR, proved to be more useful because the 5 major HCV genotypes and
their subtypes can be determined (Stuyver et al., 1993). The selection of high-titer isolates
makes it possible to obtain PCR fragments for cloning with only 2 primers, while nested PCR
requires that 4 primers match the unknown sequences of the new type 3, 4 and 5 genotypes.

New sequences of the 5' untranslated region (5'UR) have been listed by Bukh et al.
(1992). For some of these, the E1 region has recently been described (Bukh et al., 1993).
Isolates with similar sequences in the 5'UR to a group of isolates including DK12 and HK10
described by Bukh et al. (1992) and E-b1 to E-b8 described and classified as type 3 by Chan
et al. (1991), have been reported and described in the 5'UR, the carboxyterminal part of E1
and in the NS5 region as group IV by Cha et al. (1992; WO 92/19743), and have also been
described in the 5'UR for isolate BR56 and classified as type 3 by the inventors of this
application (Stuyver et al., 1993).

The aim of the present invention is to provide new HCV nucleotide and amino acid
sequences enabling the detection of HCV infection.

Another aim of the present invention is to provide new nucleotide and amino acid HCV
sequences enabling the classification of infected biological fluids into different serological
groups unambiguously linked to types and subtypes at the genome level.

Another aim of the present invention is to provide new nucleotide and amino acid HCV
sequences ameliorating the overall HCV detection rate.

Another aim of the present invention is to provide new HCV sequences, useful for the
design of HCV vaccine compositions.

Another aim of the present invention is to provide a pharmaceutical composition consisting 
of antibodies raised against the polypeptides encoded by these new HCV sequences, for 
therapy or diagnosis. 

The present invention relates more particularly to a composition comprising or consisting 
of at least one polynucleic acid containing at least 5, and preferably 8 or more contiguous 
nucleotides selected from at least one of the following HCV sequences: 
- an HCV type 3 genomic sequence, more particularly in any of the following
regions:
  - the region spanning positions 417 to 957 of the Core/E1 region of HCV
    subtype 3a,
  - the region spanning positions 4664 to 4730 of the NS3 region of HCV type 3,
  - the region spanning positions 4892 to 5292 of the NS3/4 region of HCV
    type 3,
  - the region spanning positions 8023 to 8235 of the NS5 region of the BR36
    subgroup of HCV subtype 3a,
- an HCV subtype 3c genomic sequence,

more particularly the coding regions of the above-specified regions;

- an HCV subtype 2d genomic sequence, more particularly the coding region of HCV
subtype 2d;

- an HCV type 4 genomic sequence, more particularly the coding region, more particularly
the coding region of subtypes 4a, 4e, 4f, 4g, 4h, 4i, and 4j,

- an HCV type 5 genomic sequence, more particularly the coding region of HCV type 5,
more particularly the regions encoding Core, E1, E2, NS3, and NS4

with said nucleotide numbering being with respect to the numbering of HCV nucleic acids
as shown in Table 1, and with said polynucleic acids containing at least one nucleotide
difference with known HCV (type 1, type 2, and type 3) polynucleic acid sequences in the
above-indicated regions, or the complement thereof.

It is to be noted that the nucleotide difference in the polynucleic acids of the invention may
or may not involve an amino acid difference in the corresponding amino acid sequences coded
by said polynucleic acids.

According to a preferred embodiment, the present invention relates to a composition 
comprising or containing at least one polynucleic acid encoding an HCV polyprotein, with 
said polynucleic acid containing at least 5, preferably at least 8 nucleotides corresponding to 
at least part of an HCV nucleotide sequence encoding an HCV polyprotein, and with said 
HCV polyprotein containing in its sequence at least one of the following amino acid residues: 
L7, Q43, M44, S60, R67, Q70, T71, A79, A87, N106, K115, A127, A190, S130, V134,
G142, I144, E152, A157, V158, P165, S177 or Y177, I178, V180 or E180 or F182, R184,
I186, H187, T189, A190, S191 or G191, Q192 or L192 or I192 or V192 or E192, N193 or
H193 or P193, W194 or Y194, H195, A197 or I197 or V197 or T197, V202, I203 or L203,
Q208, A210, V212, F214, T216, R217 or D217 or E217 or V217, H218 or N218, H219 or


WO 94/25601 

PCT/EP94/01323 


0235 " W ' M23 ' " °' mU ™ " ™* •«» V K232, 

Q235 or I235, A237 or T237, I242, I246, S247, S248, V249, S250 or Y250, I251 or V251

or M251 or F251, D252, T254 or V254, L255 or V255, E256 or A256, M258 or F258 or 
V258, A260 or Q260 or S260, A26,, T264 or Y264, M265, I266,or A266, A267, G268 or 
| . T268, F271 or M271 or V271, 1277, M280 or H280, ,284 or A284 or L84 V274 V291 
N292 or S292, R293 or ,293 or Y293, Q294 or R294, L291 or .297 or Q297, A299 or K299 
or Q299, N303 or T303, T308 or L308, T310 or F3.0 or A310 or D3,0 or V310 L313 
,03.7 or Q317 , L33 3, S35., A358. A359, A363, S364, A366, 7369, L373, F376,' Q386 
1387, S392, ,399, F402, ,403, R405„D454, A46., A463, T464, K 484, Q500, E50. S52.' 
K522, H524, N528, S53,, S532, V534, F536, F537, M539, ,546, C.282, A.283, H.3.o' 
V.3.2. Q132.. P.368, V.372, V1373, K1405, Q,406, S.409, A.424, A,429 CHSs'' 
SU? 8,456, H.496, A,504, D,5,0, D,,529, ,1543, N1567, , D1 556, N.567 M.572' 
Q,579, U58,, S,583, F,585, V,595,; E,606 or T,606, M,6„, V,6,2 or L.612 P,63o' 
C,636,. P1 651, T.656 or ,,656, U663, V.667, V.677, A,68,, H.685. E.687, GI689' 
V.695, A.700, Q,704, Y,705, A17.3, A17,4 „rS,7,4, M1718, D,7,9, A.72, or T,72,' 
R1722, A1723 or V,723, H,726 or G,726, E1730, V1732. F,735, ,,736,' S1737 RTOg' 
T1739, 0,740, Q,74,, K,742, Q.743, A,744. T1745, L.746, B.747 or K1747 ,,749' 
A1750. T,75, or A,751, V,753, N.755, K,756, A.757, P,758, A,759, H,762,' T ,763' 
Y.764, P2645, A2647, K2650, K2653 or L2653, S2664. N2673, F2680, K268,, L26 86 ' 
H2692, Q2695 or L2695 or ,2695, V2712, F2715, V27.9 or Q2719, TC722, TO724 S2725 
R2726, G2729, Y2735, H2739, 12748, G2746 or ,2746, ,2748, P2752 or K2752 P2754 or 
T2754, T2757 or P2757, with said notation being composed of a letter representing the amino
acid residue by its one-letter code, and a number representing the amino acid numbering
according to Kato et al., 1990.

Each of the above-mentioned residues can be found in any of Figures 2, 5, 7, 11 or 12
showing the new amino acid sequences of the present invention aligned with known sequences
of other types or subtypes of HCV for the Core, E1, E2, NS3, NS4, and NS5 regions.

More particularly, a polynucleic acid contained in the composition according to the present
invention contains at least 5, preferably 8, or more contiguous nucleotides corresponding to
a sequence of contiguous nucleotides selected from at least one of HCV sequences encoding
the following new HCV amino acid sequences:

- new sequences spanning amino acid positions 1 to 319 of the Core/E1 region of HCV
subtype 2d, type 3 (more particularly new sequences for subtypes 3a and 3c), new type 4
subtypes (more particularly new sequences for subtypes 4a, 4e, 4f, 4g, 4h, 4i and 4j) and
type 5a, as shown in Figure 5;

- new sequences spanning amino acid positions 328 to 546 of the E1/E2 region of HCV
subtype 5a as shown in Figure 12;

- new sequences spanning amino acid positions 1556 to 1764 of the NS3/NS4 region of
HCV type 3 (more particularly for new subtype 3a sequences), and subtype 5a, as shown
in Figure 7 or 11;

- new sequences spanning amino acid positions 2645 to 2757 of the NS5B region of HCV
subtype 2d, type 3 (more particularly for new subtypes 3a and 3c), new type 4 subtypes
(more particularly subtypes 4a, 4e, 4f, 4g, 4h, 4i and 4j) and subtype 5a, as shown in
Figure 2.

Using the LiPA system mentioned above, Brazilian blood donors with high-titer type 3
hepatitis C virus, Gabonese patients with high-titer type 4 hepatitis C virus, and a Belgian
patient with high-titer HCV type 5 infection were selected. Nucleotide sequences in the core,
E1, NS5 and NS4 regions which have not been reported before were analyzed in the
frame of the invention. Coding sequences (with the exception of the core region) of any type
4 isolate are reported for the first time in the present invention. The NS5b region was also
analyzed for the new type 3 isolates. After having determined the NS5b sequences,
comparison with the Ta and Tb subtypes described by Mori et al. (1992) was possible, and
the type 3 sequences could be identified as type 3a genotypes. The new type 4 isolates
segregated into 10 subtypes, based on homologies obtained in the NS5 and E1 regions. New
type 2 and 3 sequences could also be distinguished from previously described type 2 or 3
subtypes from sera collected in Belgium and the Netherlands.

The term "polynucleic acid" refers to a single stranded or double stranded nucleic acid 
sequence which may contain at least 5 contiguous nucleotides to the complete nucleotide
sequence (for instance at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more contiguous
nucleotides). A polynucleic acid which is up to about 100 nucleotides in length is often also referred to as
an oligonucleotide. A polynucleic acid may consist of deoxyribonucleotides or 
ribonucleotides, nucleotide analogues or modified nucleotides, or may have been adapted for 
therapeutic purposes. A polynucleic acid may also comprise a double stranded cDNA clone 
which can be used for cloning purposes, or for in vivo therapy, or prophylaxis. 
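
The "at least 5, preferably 8 or more contiguous nucleotides" criterion can be illustrated with a minimal Python sketch that checks whether a candidate oligonucleotide shares a run of at least k contiguous nucleotides with a reference sequence. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def shares_contiguous_run(candidate: str, reference: str, k: int = 8) -> bool:
    """Return True if candidate and reference share at least k contiguous nucleotides."""
    if k <= 0:
        raise ValueError("k must be positive")
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < k or len(reference) < k:
        return False
    # Collect every k-mer of the reference, then look for any candidate k-mer among them.
    reference_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(candidate[i:i + k] in reference_kmers
               for i in range(len(candidate) - k + 1))


# Toy sequences (hypothetical, for illustration only): they share a 10-nucleotide run.
reference = "ATGAGCACGAATCCTAAACCTCAAAG"
probe = "TTTCACGAATCCTTT"
print(shares_contiguous_run(probe, reference, k=8))
print(shares_contiguous_run(probe, reference, k=12))
```

Any shared run of k or more nucleotides necessarily contains a shared k-mer, so comparing k-mers is sufficient for this check.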

The term "polynucleic acid composition" refers to any kind of composition comprising
essentially said polynucleic acids. Said composition may be of a diagnostic or a therapeutic
nature.

The expression "nucleotides corresponding to" refers to nucleotides which are homologous
or complementary to an indicated nucleotide sequence or region within a specific HCV
sequence.

The term "coding region" corresponds to the region of the HCV genome that encodes the
HCV polyprotein. In fact, it comprises the complete genome with the exception of the 5'
untranslated region and 3' untranslated region.

The term "HCV polyprotein" refers to the HCV polyprotein of the HCV-J isolate (Kato
et al., 1990). The adenine residue at position 330 (Kato et al., 1990) is the first residue of
the ATG codon that initiates the long HCV polyprotein of 3010 amino acids in HCV-J and
other type 1b isolates, and of 3011 amino acids in HCV-1 and other type 1a isolates, and of
3033 amino acids in type 2 isolates HC-J6 and HC-J8 (Okamoto et al., 1992).

This adenine is designated as position 1 at the nucleic acid level, and this methionine is
designated as position 1 at the amino acid level, in the present invention. As type 1a isolates
contain 1 extra amino acid in the NS5a region, coding sequences of type 1a and 1b have
identical numbering in the Core, E1, NS3, and NS4 region, but will differ in the NS5b region
as indicated in Table 1. Type 2 isolates have 4 extra amino acids in the E2 region, and 17
or 18 extra amino acids in the NS5 region compared to type 1 isolates, and will differ in
numbering from type 1 isolates in the NS3/4 region and NS5b regions as indicated in Table 1.

Region    Positions          Positions              Positions              Positions
          described in       described for          described for          described for
          the invention*     HCV-J                  HCV-1                  HC-J6, HC-J8
                             (Kato et al., 1990)    (Choo et al., 1991)    (Okamoto et al., 1992)

Nucleotides

NS5b      8023/8235          8352/8564              8026/8238              8433/8645
          7932/8271          8261/8600              7935/8274              8342/8681

NS3/4     4664/5292          4993/5621              4664/5292              5017/5645
          4664/4730          4993/5059              4664/4730              5017/5083
          4892/5292          5221/5621              4892/5292              5245/5645
          3856/4209          4185/4528              3856/4209              4209/4762
          4936/5292          5265/5621              4936/5292              5289/5645

[...] region of present invention: [positions illegible]

Amino Acids

NS5b      2675/2745          2675/2745              2676/2746              2698/2768
          2645/2757          2645/2757              2646/2758              2668/2780

NS3/4     1556/1764          1556/1764              1556/1764              1560/1768
          1286/1403          1286/1403              1286/1403              1290/1407
          1646/1764          1646/1764              1646/1764              1650/1768


Table 1: Comparison of the HCV nucleotide and amino acid numbering system used in the
present invention (*) with the numbering used for other prototype isolates. For
example, 8352/8564 indicates the region designated by the numbering from
nucleotide 8352 to nucleotide 8564 as described by Kato et al. (1990). Since the
numbering system of the present invention starts at the polyprotein initiation site,
the 329 nucleotides of the 5' untranslated region described by Kato et al. (1990)
have to be subtracted, and the corresponding region is numbered from nucleotide
8023 ("8352-329") to 8235 ("8564-329").
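
The conversion described in the caption of Table 1 amounts to a fixed offset. The following minimal Python sketch shows it; the constant and function names are hypothetical, and only the offset of 329 nucleotides and the worked values 8352/8564 come from the text.

```python
# Length of the 5' untranslated region in the Kato et al. (1990) numbering,
# as stated in the caption of Table 1.
KATO_5UR_LENGTH = 329


def kato_to_invention(position: int) -> int:
    """Convert a Kato et al. (1990) nucleotide position to the numbering used here,
    in which the adenine of the initiating ATG codon is position 1."""
    converted = position - KATO_5UR_LENGTH
    if converted < 1:
        raise ValueError("position lies in the 5' untranslated region")
    return converted


def invention_to_kato(position: int) -> int:
    """Inverse conversion, back to the Kato et al. (1990) numbering."""
    return position + KATO_5UR_LENGTH


print(kato_to_invention(8352), kato_to_invention(8564))  # 8023 8235
```

The same offset maps the adenine at Kato position 330 to position 1.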

The term "HCV type" corresponds to a group of HCV isolates of which the complete
genome shows more than 74% homology at the nucleic acid level, or of which the NS5 region
between nucleotide positions 7932 and 8271 shows more than 74% homology at the nucleic
acid level, or of which the complete HCV polyprotein shows more than 78% homology at the
amino acid level, or of which the NS5 region between amino acids at positions 2645 and 2757
shows more than [illegible]% homology at the amino acid level, to polyproteins of the other
isolates of the group, with said numbering beginning at the first ATG codon or first
methionine of the long HCV polyprotein of the HCV-J isolate (Kato et al., 1990). Isolates
belonging to different types of HCV exhibit homologies, over the complete genome, of less
than 74% at the nucleic acid level and less than 78% at the amino acid level. Isolates
belonging to the same type usually show homologies of about 92 to 95% at the nucleic acid
level and [illegible] at the amino acid level when belonging to the same subtype, and those
belonging to the same type but different subtypes preferably show homologies of about 79%
at the nucleic acid level and 85-86% at the amino acid level.

More preferably the definition of HCV types is concluded from the classification of HCV
isolates according to their nucleotide distances calculated as detailed below:

(1) based on phylogenetic analysis of nucleic acid sequences in the NS5b region between
nucleotides 7935 and 8274 (Choo et al., 1991) or 8261 and 8600 (Kato et al., 1990) or 8342
and 8681 (Okamoto et al., 1991), isolates belonging to the same HCV type show nucleotide
distances of less than 0.34, usually of less than 0.33, and more usually of less than 0.32, and
isolates belonging to the same subtype show nucleotide distances of less than 0.135, usually
of less than 0.13, and more usually of less than 0.125, and consequently isolates belonging to
the same type but different subtypes show nucleotide distances ranging from 0.135 to 0.34,
usually ranging from 0.1384 to 0.2477, and more usually ranging from 0.15 to 0.32, and
isolates belonging to different HCV types show nucleotide distances greater than 0.34, usually
greater than 0.35, and more usually of greater than 0.358, more usually ranging from 0.1384 [...],

(2) based on phylogenetic analysis of nucleic acid sequences in the core/E1 region between
nucleotides 378 and 957, isolates belonging to the same HCV type show nucleotide distances
of less than 0.38, usually of less than 0.37, and more usually of less than 0.364, and isolates
belonging to the same subtype show nucleotide distances of less than 0.17, usually of less than
0.16, and more usually of less than 0.15, more usually of less than 0.135, more usually of less
than 0.134, and consequently isolates belonging to the same type but different subtypes show
nucleotide distances ranging from 0.15 to 0.38, usually ranging from 0.16 to 0.37, and more
usually ranging from 0.17 to 0.36, more usually ranging from 0.133 to 0.379, and isolates
belonging to different HCV types show nucleotide distances greater than 0.34, 0.35, 0.36,
usually more than 0.365, and more usually of greater than 0.37,

(3) based on phylogenetic analysis of nucleic acid sequences in the NS3/NS4 region
between nucleotides 4664 and 5292 (Choo et al., 1991) or between nucleotides 4993 and 5621
(Kato et al., 1990) or between nucleotides 5017 and 5645 (Okamoto et al., 1991), isolates
belonging to the same HCV type show nucleotide distances of less than 0.35, usually of less
than 0.34, and more usually of less than 0.33, and isolates belonging to the same subtype show
nucleotide distances of less than 0.19, usually of less than 0.18, and more usually of less than
0.17, and consequently isolates belonging to the same type but different subtypes show
nucleotide distances ranging from 0.17 to 0.35, usually ranging from 0.18 to 0.34, and more
usually ranging from 0.19 to 0.33, and isolates belonging to different HCV types show
nucleotide distances greater than 0.33, usually greater than 0.34, and more usually of greater
than 0.35.
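
The NS5b criterion can be sketched as a simple threshold rule. The following minimal Python example assumes the broadest cut-offs of criterion (1), namely 0.135 (same subtype) and 0.34 (same type); the function name is hypothetical, and treating a distance of exactly 0.34 as "same type" is an assumption about the boundary.

```python
def classify_ns5b_distance(distance: float) -> str:
    """Classify a pairwise NS5b nucleotide distance using the broadest
    cut-offs of criterion (1): 0.135 and 0.34."""
    if distance < 0:
        raise ValueError("a nucleotide distance cannot be negative")
    if distance < 0.135:
        return "same subtype"
    if distance <= 0.34:  # boundary handling is an assumption
        return "same type, different subtype"
    return "different type"


# Example distances (hypothetical inputs chosen to fall in each class).
for d in (0.06, 0.22, 0.50):
    print(d, "->", classify_ns5b_distance(d))
```

A single region is used here for brevity; the text itself prefers sequence information from at least 2 regions before final classification.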

Table 2: Molecular evolutionary distances*

Region       Core/E1              E1                   NS5B                 NS5B
             579 bp               384 bp               340 bp               222 bp

Isolates*    0.0017 - 0.1347      0.0026 - 0.2031      0.0003 - 0.1151      0.000 - 0.1323
             (0.0750 ± 0.0245)    (0.0969 ± 0.0289)    (0.0637 ± 0.0229)    (0.0607 ± 0.0205)

Subtypes*    0.1330 - 0.3794      0.1645 - 0.4869      0.1384 - 0.2977      0.117 - 0.3538
             (0.2786 ± 0.0363)    (0.3761 ± 0.0433)    (0.2219 ± 0.0341)    (0.2391 ± 0.0399)

Types*       0.3479 - 0.6306      0.4309 - 0.9561      0.3581 - 0.6670      0.3457 - 0.7471
             (0.4703 ± 0.0525)    (0.6308 ± 0.0928)    (0.4994 ± 0.0495)    (0.5295 ± 0.0627)

* Figures created by the PHYLIP program DNADIST are expressed as minimum to
maximum (average ± standard deviation). Phylogenetic distances for isolates belonging
to the same subtype ('isolates'), to different subtypes of the same type ('subtypes'), and
to different types ('types') are given.

In a comparative phylogenetic analysis of available sequences, ranges of molecular
evolutionary distances for different regions of the genome were calculated, based on 19,781
pairwise comparisons by means of the DNADIST program of the phylogeny inference
package PHYLIP version 3.5c (Felsenstein, 1993). The results are shown in Table 2 and
indicate that although the majority of distances obtained in each region fit with classification
of a certain isolate, only the ranges obtained in the 340 bp NS5B region are non-overlapping
and therefore conclusive. However, as was performed in the present invention, it is preferable
to obtain sequence information from at least 2 regions before final classification of a given
isolate.
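
The statement that only the 340 bp NS5B ranges are non-overlapping can be checked directly from the minimum and maximum values of Table 2. The following minimal Python sketch does so; the variable and function names are hypothetical, and only the numeric ranges are taken from Table 2.

```python
# (minimum, maximum) distances from Table 2, ordered: isolates, subtypes, types.
TABLE_2_RANGES = {
    "Core/E1 579 bp": [(0.0017, 0.1347), (0.1330, 0.3794), (0.3479, 0.6306)],
    "E1 384 bp": [(0.0026, 0.2031), (0.1645, 0.4869), (0.4309, 0.9561)],
    "NS5B 340 bp": [(0.0003, 0.1151), (0.1384, 0.2977), (0.3581, 0.6670)],
    "NS5B 222 bp": [(0.000, 0.1323), (0.117, 0.3538), (0.3457, 0.7471)],
}


def ranges_overlap(ranges):
    """Return True if any range reaches into the next higher classification level."""
    return any(upper >= next_lower
               for (_, upper), (next_lower, _) in zip(ranges, ranges[1:]))


non_overlapping = [region for region, ranges in TABLE_2_RANGES.items()
                   if not ranges_overlap(ranges)]
print(non_overlapping)  # ['NS5B 340 bp']
```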

Designation of a number to the different types of HCV and HCV types nomenclature is
based on chronological discovery of the different types. The numbering system used in the
present invention might still fluctuate according to international conventions or guidelines. For
example, "type 4" might be changed into "type 5" or "type 6".

The term "subtype" corresponds to a group of HCV isolates of which the complete
polyprotein shows a homology of more than 90% both at the nucleic acid and amino acid
levels, or of which the NS5 region between nucleotide positions 7932 and 8271 shows a
homology of more than 90% at the nucleic acid level to the corresponding parts of the
genomes of the other isolates of the same group, with said numbering beginning with the
adenine residue of the initiation codon of the HCV polyprotein. Isolates belonging to the same
type but different subtypes of HCV show homologies of more than 74% at the nucleic acid
level and of more than 78% at the amino acid level.

The term "BR36 subgroup" refers to a group of type 3a HCV isolates (BR36, BR33,
BR34) that are 95%, preferably 95.5%, most preferably 96% homologous to the sequences
as represented in SEQ ID NO 1, 3, 5, 7, 9, 11 in the NS5b region from position 8023 to 
8235. 
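
A homology threshold of this kind can be illustrated with a minimal Python sketch. It assumes that "homology" is computed as the percentage of identical positions between two pre-aligned sequences of equal length; that reading, the function names and the toy sequences are assumptions and are not taken from the sequence listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_homology_threshold(query: str, reference: str, threshold: float = 95.0) -> bool:
    """True if the query reaches the given homology threshold (95% by default)."""
    return percent_identity(query, reference) >= threshold


# Toy 20-nucleotide sequences differing at one position (hypothetical data).
reference = "ACGTACGTACGTACGTACGT"
query = "ACGTACGTACGTACGTACGA"
print(percent_identity(query, reference))
print(meets_homology_threshold(query, reference, threshold=96.0))
```

In practice the comparison would be restricted to the stated region (positions 8023 to 8235 of the NS5b region) after alignment.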

It is to be understood that extremely variable regions like the E1, E2 and NS4 regions will
exhibit lower homologies than the average homology of the complete genome of the
polyprotein.

Using these criteria, HCV isolates can be classified into at least 6 types. Several subtypes 
can clearly be distinguished in types 1, 2, 3 and 4: 1a, 1b, 2a, 2b, 2c, 2d, 3a, 3b, 4a, 4b,
4c, 4d, 4e, 4f, 4g, 4h, 4i and 4j based on homologies of the 5' UR and coding regions 
including the part of NS5 between positions 7932 and 8271. An overview of most of the 
reported isolates and their proposed classification according to the typing system of the 
present invention as well as other proposed classifications is presented in Table 3. 

Table 3: HCV CLASSIFICATION

        OKAMOTO   MORI   NAKAO   CHA    PROTOTYPE

1a      I         I      Pt      GI     HCV-1, HCV-H, HC-J1
1b      II        II     K1      GII    HCV-J, HCV-BK, HCV-T, HC-JK1, HC-J4, HCV-CHINA
1c                                      HC-G9
2a      III       III    K2a     GIII   HC-J6
2b      IV        IV     K2b     GIII   HC-J8
2c                                      S83, ARG6, ARG8, 110, T983
2d                                      NE92
3a      V         V      K3      GIV    E-b1, Ta, BR36, BR33, HD10, NZL1
3b                VI     K3      GIV    HCV-TR, Tb
3c                                      BE98
4a                                      Z4, GB809-4
4b                                      Z1
4c                                      GB116, GB358, GB215, Z6, Z7
4d                                      DK13
4e                                      GB809-2, CAM600, CAM736
4f                                      CAM622, CAM627
4g                                      GB549
4h                                      GB438
4i                                      CAR4/1205
4j                                      CAR1/501
4k                                      EG29
5a                               GV     SA3, SA4, SA1, SA7, SA11, BE95
6a                                      HK1, HK2, HK3, HK4




The term "complement" refers to a nucleotide sequence which is complementary to an 
indicated sequence and which is able to hybridize to the indicated sequences. 
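
For a DNA sequence, the complement that hybridizes to an indicated sequence is conventionally written as its reverse complement. A minimal Python sketch follows; the function name and the example sequence are hypothetical, and RNA bases (U) are mapped to A as a simplifying assumption.

```python
# Base-pairing table for DNA; U is accepted on input and paired with A.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a nucleotide sequence, read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGAGC"))  # GCTCAT
```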

The composition of the invention can comprise many combinations. By way of example, 
the composition of the invention can comprise:

- two (or more) nucleic acids from the same region or,

- two nucleic acids (or more), respectively from different regions, for the same isolate or 
for different isolates, 

- or nucleic acids from the same regions and from at least two different regions (for the 
same isolate or for different isolates). 

The present invention relates more particularly to a polynucleic acid composition as defined 
above, wherein said polynucleic acid corresponds to a nucleotide sequence selected from any 
of the following HCV type 3 genomic sequences:

- an HCV genomic sequence having a homology of at least 67%, preferably more than 69%,
more preferably 71%, even more preferably more than 73%, or most preferably more than
76% to any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or
27 (HD10, BR36 or BR33 sequences) in the region spanning positions 417 to 957 of the
Core/E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of at least 65%, preferably more than 67%,
preferably more than 69%, even more preferably more than 70%, most preferably more than
74% to any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or
27 (HD10, BR36 or BR33 sequences) in the region spanning positions 574 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of at least 79%, more preferably at least
81%, most preferably more than 83% to any of the sequences as represented in
SEQ ID NO 147 (representing positions 1 to 346 of the Core region of HCV type 3c,
sequence BE98) in the region spanning positions 1 to 378 of the Core region as shown in
Figure 3;

- an HCV genomic sequence of HCV type 3a having a homology of at least 74%, more
preferably at least 76%, most preferably more than 78% to any of the sequences
as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 (HD10, BR36 or BR33
sequences) in the region spanning positions 417 to 957 in the Core/E1 region as shown in
Figure 4;

- an HCV genomic sequence of HCV type 3a having a homology of at least 74%,
preferably more than 76%, most preferably 78% or more to any of the sequences as
represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 (HD10, BR36 or BR33
sequences) in the region spanning positions 574 to 957 in the E1 region as shown in Figure
4;

- an HCV genomic sequence having a homology of more than 73.5%, preferably more
than 74%, most preferably 75% to the sequence as represented in SEQ ID NO
29 (HCC153 sequence) in the region spanning positions 4664 to 4730 of the NS3 region
as shown in Figure 6;

- an HCV genomic sequence having a homology of more than 70%, preferably more than
72%, most preferably more than 74% to any of the sequences as represented
in SEQ ID NO 29, 31, 33, 35, 37 or 39 (HCC153, HD10, BR36 sequences) in the region
spanning positions 4892 to 5292 in the NS3/NS4 region as shown in Figure 6 or 10;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a having a homology
of more than 95%, preferably 95.5%, most preferably 96% to any of the
sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11 (BR34, BR33, BR36
sequences) in the region spanning positions 8023 to 8235 of the NS5 region as shown in
Figure 1;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a having a homology
of more than 96%, preferably 96.5%, most preferably 97% to any of the
sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11 (BR34, BR33, BR36
sequences) in the region spanning positions 8023 to 8192 of the NS5B region as shown in
Figure 1;

- an HCV genomic sequence of HCV type 3c being characterized as having a homology of
more than 79%, more preferably more than 81%, and most preferably more than 83% to
the sequence as represented in SEQ ID NO 149 (BE98 sequence) in the region spanning
positions 7932 to 8271 in the NS5B region as shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the 
coding regions of all the above-mentioned sequences. 

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from: 

- an HCV genomic sequence being characterized as having a nucleotide distance of less than
0.44, preferably of less than 0.40, most preferably of less than 0.36 to any of the
sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the region
spanning positions 417 to 957 of the Core/E1 region as shown in Figure 4;

- an HCV genomic sequence being characterized as having a nucleotide distance of less than
0.53, preferably less than 0.49, most preferably of less than 0.45 to any of the sequences
as represented in SEQ ID NO 19, 21, 23, 25 or 27 in the region spanning positions 574
to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence being characterized as having a nucleotide distance of less than
0.15, preferably less than 0.13, and most preferably less than 0.11 to any of the sequences as
represented in SEQ ID NO 147 in the region spanning positions 1 to 378 of the Core
region as shown in Figure 3;

- an HCV genomic sequence of HCV type 3a being characterized as having a nucleotide
distance of less than 0.3, preferably less than 0.26, most preferably of less than 0.22 to
any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the
region spanning positions 417 to 957 in the Core/E1 region as shown in Figure 4;

- an HCV genomic sequence of HCV type 3a being characterized as having a nucleotide
distance of less than 0.35, preferably less than 0.31, most preferably of less than 0.27 to
any of the sequences as represented in SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27 in the
region spanning positions 574 to 957 in the E1 region as shown in Figure 4;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a being characterized as
having a nucleotide distance of less than 0.0423, preferably less than 0.042, preferably
less than 0.0362 to any of the sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11
in the region spanning positions 8023 to 8235 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence of HCV type 3c being characterized as having a nucleotide
distance of less than 0.255, preferably of less than 0.25, more preferably of less than 0.21,
most preferably of less than 0.17 to the sequence as represented in SEQ ID NO 149 in the
region spanning positions 7932 to 8271 in the NS5B region as shown in Figure 1.

In the present application, the E1 sequences encoding the antigenic ectodomain of the E1
protein, which does not overlap the carboxyterminal signal-anchor sequences of E1 disclosed
by Cha et al. (1992; WO 92/19743), in addition to the NS4 epitope region, and a part of the
NS5 region are disclosed for 4 different isolates: BR33, BR34, BR36, HCC153 and HD10,
all belonging to type 3a (SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29,
31, 35, 37 or 39).

Also within the present invention are new subtype 3c sequences (SEQ ID NO 147, 149) of
the isolate BE98 in the Core and NS5 regions (see Figures 3 and 1).

Finally, the present invention also relates to a new subtype 3a sequence as represented in
SEQ ID NO 217 (see Figure 1).

Also included within the present invention are sequence variants of the polynucleic acids 
as selected from any of the nucleotide sequences as given in any of the above mentioned SEQ
ID numbers, with said sequence variants containing either deletions and/or insertions of one
or more nucleotides, mainly at the extremities of oligonucleotides (either 3' or 5'), or
substitutions of some non-essential nucleotides by others (including modified nucleotides and/or
inosine), for example, a type 1 or 2 sequence might be modified into a type 3 sequence by
replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type
3 as shown in Figure 1 (NS5 region), Figure 3 (Core region), Figure 4 (Core/E1 region),
Figure 6 and 10 (NS3/NS4 region).

According to another embodiment, the present invention relates to a polynucleic acid
composition as defined above, wherein said polynucleic acids correspond to a nucleotide
sequence selected from any of the following HCV type 5 genomic sequences:

- an HCV genomic sequence having a homology of more than 85%, preferably more than
86%, most preferably more than 87% homology to any of the sequences as represented
in SEQ ID NO 41, 43, 45, 47, 49, 51, 53 (PC sequences) or 151 (BE95 sequence) in the
region spanning positions 1 to 573 of the Core region as shown in Figure 9 and 3;

- an HCV genomic sequence having a homology of more than 61%, preferably more than
63%, more preferably more than 65% homology, even more preferably more than 66%
homology and most preferably more than 67% homology (e.g. 69 and 71%) to any of the
sequences as represented in SEQ ID NO 41, 43, 45, 47, 49, 51, 53 (PC sequences), 153
or 155 (BE95, BE100 sequences) in the region spanning positions 574 to 957 of the E1
region as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 76.5%, preferably of more 
than 77%, most preferably of more than 78% homology with any of the sequences as 
represented in SEQ ID NO 55, 57, 197 or 199 (PC sequences) in the region spanning 
positions 3856 to 4209 of the NS3 region as shown in Figure 6 or 10; 

- an HCV genomic sequence having a homology of more than 68%, preferably of more than
70%, most preferably of more than 72% homology with the sequence as represented in 
SEQ ID NO 157 (BE95 sequence) in the region spanning positions 980 to 1179 of the 
E1/E2 region as shown in Figure 13; 

- an HCV genomic sequence having a homology of more than 57%, preferably more than
59%, most preferably more than 61% homology to any of the sequences as represented
in SEQ ID NO 59 or 61 (PC sequences) in the region spanning positions 4936 to 5296 of
the NS4 region as shown in Figure 6 or 10;

- an HCV genomic sequence having a homology of more than 93%, preferably more than
93.5%, most preferably more than 94% homology to any of the sequences as represented
in SEQ ID NO 159 or 161 (BE95 or BE96 sequences) in the region spanning positions
7932 to 8271 of the NS5B region as shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the
coding regions of all the above-mentioned sequences.

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from:

- a nucleotide distance of less than 0.53, preferably less than 0.51, more preferably less than
0.49 for the E1 region to the type 5 sequences depicted above;

- a nucleotide distance of less than 0.3, preferably less than 0.28, more preferably of less 
than 0.26 for the Core region to the type 5 sequences depicted above; 

- a nucleotide distance of less than 0.072, preferably less than 0.071, more preferably less 
than 0.070 for the NS5B region to the type 5 sequences as depicted above.
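
The tiered thresholds listed above can be applied mechanically once a nucleotide distance is available. The following is only a minimal sketch and not part of the disclosure: the names `TYPE5_TIERS` and `tier_for_distance` and the tier labels are hypothetical, and the distance value is assumed to have been calculated beforehand by the method explained earlier in the application.

```python
# Illustrative only: upper bounds copied from the type 5 list above.
# The dictionary name, function name and tier labels are hypothetical.
TYPE5_TIERS = {
    "E1":   [("more preferred", 0.49), ("preferred", 0.51), ("broadest", 0.53)],
    "Core": [("more preferred", 0.26), ("preferred", 0.28), ("broadest", 0.30)],
    "NS5B": [("more preferred", 0.070), ("preferred", 0.071), ("broadest", 0.072)],
}


def tier_for_distance(region, distance):
    """Return the strictest tier whose upper bound exceeds the distance, or None."""
    for label, bound in TYPE5_TIERS[region]:  # ordered strictest first
        if distance < bound:  # the text says "less than", so the bound is exclusive
            return label
    return None
```

A distance that equals a bound falls through to the next, broader tier, which mirrors the strict "less than" wording of the list.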

Isolates with similar sequences in the 5'UR to a group of isolates including SA1, SA3, and 
SA7 described in the 5'UR by Bukh et al. (1992), have been reported and described in the 
5'UR and NS5 region as group V by Cha et al. (1992; WO 92/19743). This group of isolates 
belongs to type 5a as described in the present invention (SEQ ID NO 41, 43, 45, 47, 49, 51,
53, 55, 57, 59, 61, 151, 153, 155, 157, 159, 161, 197 and 199).

Also included within the present invention are sequence variants of the polynucleic acids 
as selected from any of the nucleotide sequences as given in any of the above given SEQ ID 
numbers with said sequence variants containing either deletions and/or insertions of one or
more nucleotides, mainly at the extremities of oligonucleotides (either 3' or 5'), or
substitutions of some non-essential nucleotides (i.e. nucleotides not essential to discriminate
between different genotypes of HCV) by others (including modified nucleotides and/or
inosine), for example, a type 1 or 2 sequence might be modified into a type 5 sequence by
replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type
5 as shown in Figure 3 (Core region), Figure 4 (Core/E1 region), Figure 10 (NS3/NS4
region), Figure 14 (E1/E2 region).

Another group of isolates including BU74 and BU79, having similar sequences in the 5'UR
to isolates including Z6 and Z7 as described in the 5'UR by Bukh et al. (1992), have been
described in the 5'UR and classified as a new type 4 by the inventors of this application
(Stuyver et al., 1993). Coding sequences, including Core, E1 and NS5 sequences of several
new Gabonese isolates belonging to this group, are disclosed in the present invention (SEQ
ID NO 106, 108, 110, 112, 114, 116, 118, 120 and 122).

According to yet another embodiment, the present invention relates to a composition as 
defined above, wherein said polynucleic acids correspond to a nucleotide sequence selected
from any of the following HCV type 4 genomic sequences:

- an HCV genomic sequence having a homology of more than 66%, preferably more than
68%, most preferably more than 70% homology in the E1 region spanning positions 574
to 957 to any of the sequences as represented in SEQ ID NO 118, 120 or 122 (GB358,
GB549, GB809 sequences) as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 71%, preferably more than
72%, most preferably more than 74% homology to any of the sequences as represented
in SEQ ID NO 118, 120 or 122 (GB358, GB549, GB809 sequences) in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 92%, preferably more than
93%, most preferably more than 94% homology to any of the sequences as represented
in SEQ ID NO 163 or 165 (GB809, CAM600 sequences) in the region spanning positions
1 to 378 of the Core/E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4c) having a homology of more than 85%, preferably
more than 86%, more preferably more than 86.5% homology, most preferably more than
87, more than 88 or more than 89% homology to any of the sequences as represented in
SEQ ID NO 183, 185 or 187 (GB116, GB215, GB809 sequences) in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4a) having a homology of more than 81%, preferably
more than 83%, most preferably more than 85% homology to the sequence as represented
in SEQ ID NO 189 (GB908 sequence) in the region spanning positions 379 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4e) having a homology of more than 85%, preferably
more than 87%, most preferably more than 89% homology to any of the sequences as
represented in SEQ ID NO 167 or 169 (CAM600, GB908 sequences) in the region
spanning positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4f) having a homology of more than 79%, preferably
more than 81%, most preferably more than 83% homology to any of the sequences as
represented in SEQ ID NO 171 or 173 (CAMG22, CAMG27 sequences) in the region
spanning positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4g) having a homology of more than 84%, preferably 
more than 86%, most preferably more than 88% homology to the sequence as represented 
in SEQ ID NO 175 (GB549 sequence) in the region spanning positions 379 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4h) having a homology of more than 83%, preferably
more than 85%, most preferably more than 87% homology to the sequence as represented
in SEQ ID NO 177 (GB438 sequence) in the region spanning positions 379 to 957 of the
E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4i) having a homology of more than 76%,
preferably more than 78%, most preferably more than 80% homology to the sequence as
represented in SEQ ID NO 179 (CAR4/1205 sequence) in the region spanning positions
379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4j?) having a homology of more than 84%, preferably 
more than 86%, most preferably more than 88% homology to the sequence as represented
in SEQ ID NO 181 (CAR4/901 sequence) in the region spanning positions 379 to 957 of
the E1 region as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 73%, preferably more than
75%, most preferably more than 77% homology to any of the sequences as represented
in SEQ ID NO 106, 108, 110, 112, 114, or 116 (GB48, GB116, GB215, GB358, GB549,
GB809 sequences) in the region spanning positions 7932 to 8271 of the NS5 region as
shown in Figure 1;

- an HCV genomic sequence (subtype 4c) having a homology of more than 88%, preferably
more than 89%, most preferably more than 90% homology to any of the sequences as 
represented in SEQ ID NO 106, 108, 110, or 112 (GB48, GB116, GB215, GB358 
sequences) in the region spanning positions 7932 to 8271 of the NS5 region as shown in 
Figure 1; 

- an HCV genomic sequence (subtype 4e) having a homology of more than 88%, preferably
more than 89%, most preferably more than 90% homology to any of the sequences as
represented in SEQ ID NO 116 or 201 (GB809 or CAM600 sequences) in the region
spanning positions 7932 to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4f) having a homology of more than 87%, preferably
more than 89%, most preferably more than 90% homology to the sequence as represented
in SEQ ID NO 203 (CAMG22 sequence) in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4g) having a homology of more than 85%,
preferably more than 87%, most preferably more than 89% homology to the sequence as
represented in SEQ ID NO 114 (GB549 sequence) in the region spanning positions 7932
to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4h) having a homology of more than 86%,
preferably more than 87%, more preferably more than 88% homology, more preferably
more than 89% homology to the sequence as represented in SEQ ID NO 207 (GB437
sequence) in the region spanning positions 7932 to 8271 of the NS5 region as shown in
Figure 1;

- an HCV genomic sequence (subtype 4i) having a homology of more than 84%, preferably
more than 86%, most preferably more than 88% homology to the sequence as represented
in SEQ ID NO 209 (CAR4/1205 sequence) in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4j) having a homology of more than 81%, preferably
more than 83%, most preferably more than 85% homology to the sequence as represented
in SEQ ID NO 211 (CAR1/501 sequence) in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the 
coding regions of all the above-mentioned sequences. 
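
The percentage thresholds in the list above presuppose a way of scoring two sequences over a fixed region. The sketch below is one hedged reading, not the method of the application: it assumes that "homology" means the percentage of identical positions between two sequences that have already been aligned to a common coordinate system, that positions are 1-based and inclusive, and that gapped columns are skipped. The function names and the toy sequences in the usage line are hypothetical.

```python
def percent_homology(seq_a, seq_b, start, end):
    """Percent identical positions between two pre-aligned sequences.

    start and end are 1-based, inclusive alignment coordinates (an assumption);
    columns where either sequence carries a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    region_a = seq_a[start - 1:end].upper()
    region_b = seq_b[start - 1:end].upper()
    pairs = [(a, b) for a, b in zip(region_a, region_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions in the region")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def meets_threshold(seq_a, seq_b, start, end, minimum):
    """True when homology over the region is strictly more than `minimum` percent."""
    return percent_homology(seq_a, seq_b, start, end) > minimum
```

For instance, `meets_threshold("ACGTACGTAC", "ACGTTCGTAC", 1, 10, 85)` compares two ten-nucleotide toy sequences that differ at one position. A real comparison would pass the aligned isolate and reference sequences together with the region boundaries given in each list item, such as 7932 to 8271 for the NS5 region.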

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from:

- an HCV genomic sequence (type 4) being characterized as having a nucleotide distance of 
less than 0.52, 0.50, 0.4880, 0.46, 0.44, 0.43 or most preferably less than 0.42 in the
region spanning positions 574 to 957 to any of the sequences as represented in SEQ ID NO
118, 120 or 122 in the region spanning positions 1 to 957 of the Core/E1 region as shown
in Figure 4; 

- an HCV genomic sequence (type 4) being characterized as having a nucleotide distance of
less than 0.39, 0.36, 0.34, 0.32 or most preferably less than 0.31 to any of the sequences
as represented in SEQ ID NO 118, 120 or 122 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4c) being characterized as having a nucleotide distance 
of less than 0.27, 0.26, 0.24, 0.22, 0.20, 0.18, 0.17, 0.162, 0.16 or most preferably less
than 0.15 to any of the sequences as represented in SEQ ID NO 183, 185 or 187 in the
region spanning positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4a) being characterized as having a nucleotide distance
of less than 0.30, 0.28, 0.26, 0.24, 0.22, 0.21 or most preferably of less than 0.205 to the
sequence as represented in SEQ ID NO 189 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4e) being characterized as having a nucleotide distance 
of less than 0.26, 0.25, 0.23, 0.21, 0.19, 0.17, 0.165, most preferably less than 0.16 to 
any of the sequences as represented in SEQ ID NO 167 or 169 in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4f) being characterized as having a nucleotide distance 
of less than 0.26, 0.24, 0.22, 0.20, 0.18, 0.16, 0.15 or most preferably less than 0.14 to
any of the sequences as represented in SEQ ID NO 171 or 173 in the region spanning
positions 379 to 957 of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4g) being characterized as having a nucleotide 
distance of less than 0.20, 0.19, 0.18, 0.17 or most preferably of less than 0.16 to the 
sequence as represented in SEQ ID NO 175 in the region spanning positions 379 to 957 
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4h) being characterized as having a nucleotide 
distance of less than 0.20, 0.19, 0.18, 0.17 and most preferably of less than 0.16 to the 
sequence as represented in SEQ ID NO 177 in the region spanning positions 379 to 957 
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (subtype 4i) being characterized as having a nucleotide distance 
of less than 0.27, 0.25, 0.23, 0.21 and preferably less than 0.16 to the sequence as 
represented in SEQ ID NO 179 in the region spanning positions 379 to 957 of the E1
region as shown in Figure 4;

- an HCV genomic sequence (subtype 4j?) being characterized as having a nucleotide 
distance of less than 0.19, 0.18, 0.17, 0.165 and most preferably of less than 0.16 to the
sequence as represented in SEQ ID NO 181 in the region spanning positions 379 to 957
of the E1 region as shown in Figure 4;

- an HCV genomic sequence (type 4) being characterized as having a nucleotide distance of 
less than 0.35, 0.34, 0.32 and most preferably of less than 0.30 to any of the sequences
as represented in SEQ ID NO 106, 108, 110, 112, 114, or 116 in the region spanning
positions 7932 to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4c) being characterized as having a nucleotide distance 
of less than 0.18, 0.16, 0.14, 0.135, 0.13, 0.1275 or most preferably less than 0.125 to
any of the sequences as represented in SEQ ID NO 106, 108, 110, or 112 in the region
spanning positions 7932 to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4e) being characterized as having a nucleotide distance 
of less than 0.15, 0.14, 0.135, 0.13 and most preferably of less than 0.125 to any of the
sequences as represented in SEQ ID NO 116 or 201 in the region spanning positions 7932
to 8271 of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4f) being characterized as having a nucleotide distance
of less than 0.15, 0.14, 0.135, 0.13 or most preferably less than 0.125 to the sequence as
represented in SEQ ID NO 203 in the region spanning positions 7932 to 8271 of the NS5
region as shown in Figure 1;

- an HCV genomic sequence (subtype 4g) being characterized as having a nucleotide 
distance of less than 0.17, 0.16, 0.15, 0.14, 0.13 or most preferably less than 0.125 to the 
sequence as represented in SEQ ID NO 114 in the region spanning positions 7932 to 8271
of the NS5 region as shown in Figure 1; 

- an HCV genomic sequence (subtype 4h) being characterized as having a nucleotide 
distance of less than 0.155, 0.15, 0.145, 0.14, 0.135, 0.13 or most preferably less than 
0.125 to the sequence as represented in SEQ ID NO 207 in the region spanning positions 
7932 to 8271 of the NS5 region as shown in Figure 1; 

- an HCV genomic sequence (subtype 4i) being characterized as having a nucleotide distance 
of less than 0.17, 0.16, 0.15, 0.14, 0.13 or most preferably of less than 0.125 to the 
sequence as represented in SEQ ID NO 209 in the region spanning positions 7932 to 8271 
of the NS5 region as shown in Figure 1;

- an HCV genomic sequence (subtype 4j) being characterized as having a nucleotide distance 
of less than 0.21, 0.20, 0.19, 0.18, 0.17, 0.16, 0.15, 0.14, 0.13 and most preferably of 
less than 0.125 to the sequence as represented in SEQ ID NO 211 in the region spanning
positions 7932 to 8271 of the NS5 region as shown in Figure 1.

Also included within the present invention are sequence variants of the polynucleic acids
as selected from any of the nucleotide sequences as given in any of the above given SEQ ID
numbers with said sequence variants containing either deletions and/or insertions of one or
more nucleotides, mainly at the extremities of oligonucleotides (either 3' or 5'), or
substitutions of some non-essential nucleotides (i.e. nucleotides not essential to discriminate
between different genotypes of HCV) by others (including modified nucleotides and/or
inosine), for example, a type 1 or 2 sequence might be modified into a type 4 sequence by
replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type
4 as shown in Figure 3 (Core region), Figure 4 (Core/E1 region), Figure 10 (NS3/NS4
region), Figure 14 (E1/E2 region).

The present invention also relates to a sequence as represented in SEQ ID NO 193 (GB724
sequence).

After aligning NS5 or E1 sequences of GB48, GB116, GB215, GB358, GB549 and
GB809, these isolates clearly segregated into 3 subtypes within type 4: GB48, GB116,
GB215 and GB358 belong to the subtype designated 4c, GB549 to subtype 4g and GB809 to
subtype 4e. In NS5, GB809 (subtype 4e) showed a higher nucleic acid homology to subtype
4c isolates (85.6 - 86.8%) than to GB549 (subtype 4g, 79.7%), while GB549 showed similar
homologies to both other subtypes (78.8 to 80% to subtype 4c and 79.7% to subtype 4e). In
E1, subtype 4c showed equal nucleic acid homologies of 75.2% to subtypes 4g and 4e while
4g and 4e were 78.4% homologous. At the amino acid level however, subtype 4e showed a 
normal homology to subtype 4c (80.2%), while subtype 4g was more homologous to 4c 
(83.3%) and 4e (84.1%). 

According to yet another embodiment, the present invention relates to a composition as 
defined above, wherein said polynucleic acids correspond to a nucleotide sequence selected 
from any of the following HCV type 2d genomic sequences: 

- an HCV genomic sequence having a homology of more than 78%, preferably more than
80%, most preferably more than 82% homology to the sequence as represented in SEQ
ID NO 143 (NE92) in the region spanning positions 379 to 957 of the Core/E1 region as
shown in Figure 4;

- an HCV genomic sequence having a homology of more than 74%, preferably more than
76%, most preferably more than 78% homology to the sequence as represented in SEQ
ID NO 143 (NE92) in the region spanning positions 574 to 957 as shown in Figure 4;

- an HCV genomic sequence having a homology of more than 87%, preferably more than
89%, most preferably more than 91% homology to the sequence as represented in SEQ
ID NO 145 (NE92) in the region spanning positions 7932 to 8271 of the NS5B region as
shown in Figure 1.

Preferentially the above-mentioned genomic HCV sequences depict sequences from the
coding regions of all the above-mentioned sequences.

According to the nucleotide distance classification system (with said nucleotide distances 
being calculated as explained above), said sequences of said composition are selected from: 
- a nucleotide distance of less than 0.32, preferably less than 0.31, more preferably less than
0.30 for the E1 region (574 to 957) to any of the above specified sequences;

- a nucleotide distance of less than 0.08, preferably less than 0.07, more preferably less than
0.06 for the Core region (1 to 378) to any of the above given sequences;

- a nucleotide distance of less than 0.15, preferentially less than 0.13, more preferentially
less than 0.12 for the NS5B region to any of the above specified sequences.

Polynucleic acid sequences according to the present invention which are homologous to the
sequences as represented by a SEQ ID NO can be characterized and isolated according to any
of the techniques known in the art, such as amplification by means of type or subtype specific
primers, hybridization with type or subtype specific probes under more or less stringent
conditions, serological screening methods (see examples 4 and 11) or via the LiPA typing
system. 

Polynucleic acid sequences of the genomes indicated above from regions not yet depicted 
in the present examples, figures and sequence listing can be obtained by any of the techniques 
known in the art, such as amplification techniques using suitable primers from the type or 
subtype specific sequences of the present invention.

The present invention relates also to a composition as defined above, wherein said 
polynucleic acid is liable to act as a primer for amplifying the nucleic acid of a certain isolate 
belonging to the genotype from which the primer is derived. 

An example of a primer according to this embodiment of the invention is HCPr 152 as 
shown in table 7 (SEQ ID NO 79). 

The term "primer" refers to a single stranded DNA oligonucleotide sequence capable of 
acting as a point of initiation for synthesis of a primer extension product which is 
complementary to the nucleic acid strand to be copied. The length and the sequence of the 
primer must be such that they allow priming of the synthesis of the extension products.
Preferably the primer is about 5-50 nucleotides long. Specific length and sequence will depend on
the complexity of the required DNA or RNA targets, as well as on the conditions of primer
use such as temperature and ionic strength.

The fact that amplification primers do not have to match exactly with the corresponding
template sequence to warrant proper amplification is amply documented in the literature
(Kwok et al., 1990).
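
To make the tolerance for imperfect matches concrete, the following sketch counts mismatches between a primer and each candidate site on a target sequence. It is illustrative only: `best_primer_site` is a hypothetical helper, the target is assumed to be written in the same orientation as the primer (that is, as the sequence the primer would be identical to when perfectly matched), and no claim is made here about how many mismatches a given primer tolerates in practice.

```python
def best_primer_site(primer, target):
    """Slide the primer along the target and return (position, mismatches) of the best site.

    Position is 0-based; when several sites tie, the leftmost one is returned.
    """
    primer, target = primer.upper(), target.upper()
    if len(primer) > len(target):
        raise ValueError("primer is longer than target")
    best = None
    for pos in range(len(target) - len(primer) + 1):
        window = target[pos:pos + len(primer)]
        mismatches = sum(p != t for p, t in zip(primer, window))
        if best is None or mismatches < best[1]:
            best = (pos, mismatches)
    return best
```

A caller could then compare the returned mismatch count against whatever tolerance is judged acceptable for the amplification conditions in use.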

The amplification method used can be either polymerase chain reaction (PCR; Saiki et al.,
1988), ligase chain reaction (LCR; Landgren et al., 1988; Wu & Wallace, 1989; Barany,
1991), nucleic acid sequence-based amplification (NASBA; Guatelli et al., 1990; Compton,
1991), transcription-based amplification system (TAS; Kwoh et al., 1989), strand
displacement amplification (SDA; Duck, 1990; Walker et al., 1992) or amplification by
means of Qβ replicase (Lizardi et al., 1988; Lomeli et al., 1989) or any other suitable method
to amplify nucleic acid molecules using primer extension. During amplification, the amplified
products can be conveniently labelled either using labelled primers or by incorporating
labelled nucleotides. Labels may be isotopic (32P, 35S, etc.) or non-isotopic (biotin,
digoxigenin, etc.). The amplification reaction is repeated between 20 and 80 times,
advantageously between 30 and 50 times.

The present invention also relates to a composition as defined above, wherein said 
polynucleic acid is able to act as a hybridization probe for specific detection and/or 
classification into types of a nucleic acid containing said nucleotide sequence, with said 
oligonucleotide being possibly labelled or attached to a solid substrate. 

The term "probe" refers to single stranded sequence-specific oligonucleotides which have 
a sequence which is complementary to the target sequence of the HCV genotype(s) to be 
detected. 

Preferably, these probes are about 5 to 50 nucleotides long, more preferably from about 
10 to 25 nucleotides. 

The term "solid support" can refer to any substrate to which an oligonucleotide probe can 
be coupled, provided that it retains its hybridization characteristics and provided that the 
background level of hybridization remains low. Usually the solid substrate will be a microtiter 
plate, a membrane (e.g. nylon or nitrocellulose) or a microsphere (bead). Prior to application 
to the membrane or fixation it may be convenient to modify the nucleic acid probe in order 
to facilitate fixation or improve the hybridization efficiency. Such modifications may 
encompass homopolymer tailing, coupling with different reactive groups such as aliphatic
groups, NH2 groups, SH groups, carboxylic groups, or coupling with biotin or haptens.

The present invention also relates to the use of a composition as defined above for 
detecting the presence of one or more HCV genotypes, more particularly for detecting the
presence of a nucleic acid of any of the HCV genotypes having a nucleotide sequence as
defined above, present in a biological sample liable to contain them, comprising at least the
following steps:

(i) possibly extracting sample nucleic acid,

(ii) possibly amplifying the nucleic acid with at least one of the primers as defined
above or any other HCV subtype 2d, HCV type 3, HCV type 4, HCV type 5
or universal HCV primer,

(iii) hybridizing the nucleic acids of the biological sample, possibly under denatured
conditions, and with said nucleic acids being possibly labelled during or after
amplification, at appropriate conditions with one or more probes as defined above,
with said probes being preferably attached to a solid substrate,

(iv) washing at appropriate conditions,

(v) detecting the hybrids formed,

(vi) inferring the presence of one or more HCV genotypes present from the observed
hybridization pattern.

Preferably, this technique could be performed in the Core or NS5B region.

The term "nucleic acid" can also be referred to as analyte strand and corresponds to a 
single- or double-stranded nucleic acid molecule. This analyte strand is preferentially positive- 
or negative stranded RNA, cDNA or amplified cDNA.

The term "biological sample" refers to any biological sample (tissue or fluid) containing 
HCV nucleic acid sequences and refers more particularly to blood serum or plasma samples. 

The term "HCV subtype 2d primer" refers to a primer which specifically amplifies HCV 
subtype 2d sequences present in a sample (see Examples section and figures). 

The term "HCV type 3 primer" refers to a primer which specifically amplifies HCV type
3 sequences present in a sample (see Examples section and figures).

The term "HCV type 4 primer" refers to a primer which specifically amplifies HCV type
4 genomes present in a sample.

The term "HCV type 5 primer" refers to a primer which specifically amplifies HCV type
5 genomes present in a sample.

The term "universal HCV primer" refers to oligonucleotide sequences complementary to
any of the conserved regions of the HCV genome.

The expression "appropriate" hybridization and washing conditions is to be understood
as stringent; such conditions are generally known in the art (e.g. Maniatis et al., Molecular
Cloning: A Laboratory Manual, New York, Cold Spring Harbor Laboratory, 1982).

However, according to the hybridization solution (SSC, SSPE, etc.), these probes should
be hybridized at their appropriate temperature in order to attain sufficient specificity.

The term "labelled" refers to the use of labelled nucleic acids. This may include the use
of labelled nucleotides incorporated during the polymerase step of the amplification such as
illustrated by Saiki et al. (1988) 01^ et al. (1990) or labelled primers, or by any other
method known to the person skilled in the art.

The process of the invention comprises the steps of contacting any of the probes as defined
above, with one of the following elements:

- either a biological sample in which the nucleic acids are made available for
hybridization,

- or the purified nucleic acids contained in the biological sample,

- or a single copy derived from the purified nucleic acids,

- or an amplified copy derived from the purified nucleic acids,

with said elements or with said probes being attached to a solid substrate.

The expression "inferring the presence of one or more HCV genotypes present from the
observed hybridization pattern" refers to the identification of the presence of HCV genomes
in the sample by analyzing the pattern of binding of a panel of oligonucleotide probes. Single 
probes may provide useful information concerning the presence or absence of HCV genomes 
in a sample. On the other hand, the variation of the HCV genomes is dispersed in nature, so 
rarely is any one probe able to identify uniquely a specific HCV genome. Rather, the identity 
of an HCV genotype may be inferred from the pattern of binding of a panel of 
oligonucleotide probes, which are specific for (different) segments of the different HCV 
genomes. Depending on the choice of these oligonucleotide probes, each known HCV 
genotype will correspond to a specific hybridization pattern upon use of a specific 
combination of probes. Each HCV genotype will also be able to be discriminated from any 
other HCV genotype amplified with the same primers depending on the choice of the 
oligonucleotide probes. Comparison of the generated pattern of positively hybridizing probes 
for a sample containing one or more unknown HCV sequences to a scheme of expected
hybridization patterns allows one to clearly infer the HCV genotypes present in said sample.
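
The comparison of an observed hybridization pattern with a scheme of expected patterns can be sketched in a few lines of Python. This is only an illustration: the probe names (`P1` to `P5`) and their assignment to genotypes are hypothetical and are not taken from the probe tables of this description. A genotype is reported when every probe of its expected pattern is positive, which also allows mixed samples to be flagged.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical scheme: genotype -> probes expected to hybridize.
# Names and assignments are invented for illustration only.
EXPECTED_PATTERNS: Dict[str, FrozenSet[str]] = {
    "type 3": frozenset({"P1", "P2"}),
    "type 4": frozenset({"P3", "P4"}),
    "type 5": frozenset({"P2", "P5"}),
}


def infer_genotypes(positive_probes: Iterable[str],
                    expected: Dict[str, FrozenSet[str]] = EXPECTED_PATTERNS) -> List[str]:
    """Return the genotypes whose complete expected pattern is among the positive probes."""
    observed = frozenset(positive_probes)
    return sorted(g for g, pattern in expected.items() if pattern <= observed)


def unexplained_probes(positive_probes: Iterable[str],
                       expected: Dict[str, FrozenSet[str]] = EXPECTED_PATTERNS) -> List[str]:
    """Return positive probes not accounted for by any inferred genotype."""
    observed = frozenset(positive_probes)
    explained = set()
    for genotype in infer_genotypes(observed, expected):
        explained |= expected[genotype]
    return sorted(observed - explained)
```

A pattern that leaves probes unexplained, or that matches no genotype at all, would correspond to the aberrant case in which the relevant genome portion is sequenced.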

The present invention thus relates to a method as defined above, wherein one or more 
hybridization probes are selected from any of SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59 or 61, 106,
108, 110, 112, 114, 116, 118, 120, 122, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161,
163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197,
199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 222, 269 or sequence variants thereof,
with said sequence variants containing deletions and/or insertions of one or more nucleotides,
mainly at their extremities (either 3' or 5'), or substitutions of some non-essential nucleotides
(i.e. nucleotides not essential to discriminate between genotypes) by others (including
modified nucleotides or inosine), or with said variants consisting of the complement of any
of the above-mentioned oligonucleotide probes, or with said variants consisting of
ribonucleotides instead of deoxyribonucleotides, all provided that said variant probes can be 
caused to hybridize with the same specificity as the oligonucleotide probes from which they 
are derived. 

In order to distinguish the amplified HCV genomes from each other, the target polynucleic
acids are hybridized to a set of sequence-specific DNA probes targeting HCV genotypic
regions located in the HCV polynucleic acids.

Most of these probes target the most type-specific regions of HCV genotypes, but some 
can be caused to hybridize to more than one HCV genotype. 

According to the hybridization solution (SSC, SSPE, etc.), these probes should be
stringently hybridized at their appropriate temperature in order to attain sufficient specificity.
However, by slightly modifying the DNA probes, either by adding or deleting one or a few
nucleotides at their extremities (either 3' or 5'), or substituting some non-essential nucleotides
(i.e. nucleotides not essential to discriminate between types) by others (including modified
nucleotides or inosine), these probes or variants thereof can be caused to hybridize specifically
at the same hybridization conditions (i.e. the same temperature and the same hybridization
solution). Changing the amount (concentration) of probe used may also be beneficial to obtain
more specific hybridization results. It should be noted in this context that probes of the same
length, regardless of their GC content, will hybridize specifically at approximately the same
temperature in TMACl solutions (Jacobs et al., 1988).

Suitable assay methods for purposes of the present invention to detect hybrids formed
between the oligonucleotide probes and the nucleic acid sequences in a sample may comprise
any of the assay formats known in the art, such as the conventional dot-blot format,
sandwich hybridization or reverse hybridization. For example, the detection can be
accomplished using a dot-blot format, the unlabelled amplified sample being bound to a
membrane, the membrane being incubated with at least one labelled probe under suitable
hybridization and wash conditions, and the presence of bound probe being monitored.

An alternative and preferred method is a "reverse" dot-blot format, in which the amplified
sequence contains a label. In this format, the unlabelled oligonucleotide probes are bound to
a solid support and exposed to the labelled sample under appropriate stringent hybridization
and subsequent washing conditions. It is to be understood that any other assay method
which relies on the formation of a hybrid between the nucleic acids of the sample and the
oligonucleotide probes according to the present invention may also be used.

According to an advantageous embodiment, the process of detecting one or more HCV
genotypes contained in a biological sample comprises the steps of contacting amplified HCV
nucleic acid copies derived from the biological sample, with oligonucleotide probes which
have been immobilized as parallel lines on a solid support.

According to this advantageous method, the probes are immobilized in a Line Probe Assay
(LiPA) format. This is a reverse hybridization format (Saiki et al., 1989) using membrane
strips onto which several oligonucleotide probes (including negative or positive control
oligonucleotides) can be conveniently applied as parallel lines.

The invention thus also relates to a solid support, preferably a membrane strip, carrying
on its surface one or more probes as defined above, coupled to the support in the form of
parallel lines.

The LiPA is a very rapid and user-friendly hybridization test. Results can be read 4 h
after the start of the amplification. After amplification, during which usually a non-isotopic
label is incorporated in the amplified product, and alkaline denaturation, the amplified product
is contacted with the probes on the membrane and the hybridization is carried out for about
1 to 1.5 h, after which the hybridized polynucleic acid is detected. From the hybridization
pattern generated, the HCV type can be deduced visually or, preferably, using dedicated
software. The LiPA format is completely compatible with commercially available scanning
devices, thus rendering automatic interpretation of the results very reliable. All those
advantages make the LiPA format suitable for HCV detection in a routine setting. The LiPA
format should be particularly advantageous for detecting the presence of different HCV genotypes.

The present invention also relates to a method for detecting and identifying novel HCV
genotypes, different from the known HCV genomes, comprising the steps of:

- determining to which HCV genotype the nucleotides present in a biological sample
belong, according to the process as defined above,

- in the case of observing a sample which does not generate a hybridization pattern
compatible with those defined in Table 3, sequencing the portion of the HCV
genome sequence corresponding to the aberrantly hybridizing probe of the new
HCV genotype to be determined.

The present invention also relates to the use of a composition as defined above, for
detecting one or more genotypes of HCV present in a biological sample liable to contain
them, comprising the steps of:

(i) possibly extracting sample nucleic acid,

(ii) amplifying the nucleic acid with at least one of the primers as defined above,

(iii) sequencing the amplified products,

(iv) inferring the HCV genotypes present from the determined sequences by comparison
to all known HCV sequences.
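
Step (iv) can be illustrated with a minimal Python sketch that assigns a determined sequence to the closest reference by percent identity. It assumes that query and references have already been aligned to equal length (gaps written as `-`). The reference names and sequences passed to `closest_reference` are whatever the user supplies; nothing here reproduces actual HCV reference data, and percent identity over an alignment is only one possible way of carrying out the comparison.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def closest_reference(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return (name, percent identity) of the reference most similar to the query."""
    if not references:
        raise ValueError("at least one reference sequence is required")
    best = max(references, key=lambda name: percent_identity(query, references[name]))
    return best, percent_identity(query, references[best])
```

A query whose best identity stays below the range observed within known types would be a candidate for a novel genotype; the threshold to apply is left to the user.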

The present invention also relates to a composition consisting of or comprising at least one
peptide or polypeptide comprising a contiguous sequence of at least 5 amino acids
corresponding to a contiguous amino acid sequence encoded by at least one of the HCV
genomic sequences as defined above, having at least one amino acid differing from the
corresponding region of known HCV (type 1 and/or type 2 and/or type 3) polyprotein
sequences as shown in Table 3, or muteins thereof.

It is to be noted that, at the level of the amino acid sequence, an amino acid difference 
(with respect to known HCV amino acid sequences) is necessary, which means that the 
polypeptides of the invention correspond to polynucleic acids having a nucleotide difference 
(with known HCV polynucleic acid sequences) involving an amino acid difference. 

The new amino acid sequences, as deduced from the disclosed nucleotide sequences (see
SEQ ID NO 1 to 62 and 106 to 123 and 143 to 218, 223 and 270), show homologies of only
59.9 to 78% with prototype sequences of type 1 and 2 for the NS4 region, and of only 53.9
to 68.8% with prototype sequences of type 1 and 2 for the E1 region. As the NS4 region is
known to contain several epitopes, for example characterized in patent application
EP-A-0 489 968, and as the E1 protein is expected to be subject to immune attack as part of
the viral envelope and expected to contain epitopes, the NS4 and E1 epitopes of the new
type 3, 4 and 5 isolates will consistently differ from the epitopes present in type 1 and 2
isolates. This is exemplified by the type-specificity of NS4 synthetic peptides as presented
in example 4, and the type-specificity of recombinant E1 proteins in example 11.

After aligning the new subtype 2d, type 3, 4 and 5 (see SEQ ID NO 1 to 62 and 106 to
123 and 143 to 218, 223 and 270) amino acid sequences with the prototype sequences of type
1a, 1b, 2a, and 2b, type- and subtype-specific variable regions can be delineated as presented
in Figure 5 and 7.

As to the muteins derived from the polypeptides of the invention, Table 4 gives an 
overview of the amino acid substitutions which could be the basis of some of the muteins as 
defined above. 

The peptides according to the present invention contain preferably at least 5 contiguous
HCV amino acids, preferably however at least 8 contiguous amino acids, at least 10 or at
least 15 (for instance at least 9, 11, 12, 13, 14, 20 or 25 amino acids) of the new HCV
sequences of the invention.

TABLE 4

Amino acids    Synonymous groups

Ser (S)        Ser, Thr, Gly, Asn
Arg (R)        Arg, His, Lys, Glu, Gln
Leu (L)        Leu, Ile, Met, Phe, Val, Tyr
Pro (P)        Pro, Ala, Thr, Gly
Thr (T)        Thr, Pro, Ser, Ala, Gly, His, Gln
Ala (A)        Ala, Pro, Gly, Thr
Val (V)        Val, Met, Ile, Tyr, Phe, Leu, Val
Gly (G)        Gly, Ala, Thr, Pro, Ser
Ile (I)        Ile, Met, Leu, Phe, Val, Ile, Tyr
Phe (F)        Phe, Met, Tyr, Ile, Leu, Trp, Val
Tyr (Y)        Tyr, Phe, Trp, Met, Ile, Val, Leu
Cys (C)        Cys, Ser, Thr, Met
His (H)        His, Gln, Arg, Lys, Glu, Thr
Gln (Q)        Gln, Glu, His, Lys, Asn, Thr, Arg
Asn (N)        Asn, Asp, Ser, Gln
Lys (K)        Lys, Arg, Glu, Gln, His
Asp (D)        Asp, Asn, Glu, Gln
Glu (E)        Glu, Gln, Asp, Lys, Asn, His, Arg
Met (M)        Met, Ile, Leu, Phe, Val
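
How Table 4 could serve as a basis for muteins can be sketched as follows. The dictionary transcribes the synonymous groups of Table 4 in one-letter code (repeated entries collapsed); the function name and the example peptide used in the test are hypothetical. The sketch enumerates single-substitution variants only, and residues without a row in Table 4 (such as Trp) are left unchanged.

```python
from typing import Dict, List

# Synonymous groups of Table 4 in one-letter code, first letter being the residue itself.
SYNONYMOUS: Dict[str, str] = {
    "S": "STGN",     "R": "RHKEQ",   "L": "LIMFVY",  "P": "PATG",
    "T": "TPSAGHQ",  "A": "APGT",    "V": "VMIYFL",  "G": "GATPS",
    "I": "IMLFVY",   "F": "FMYILWV", "Y": "YFWMIVL", "C": "CSTM",
    "H": "HQRKET",   "Q": "QEHKNTR", "N": "NDSQ",    "K": "KREQH",
    "D": "DNEQ",     "E": "EQDKNHR", "M": "MILFV",
}


def single_substitution_muteins(peptide: str, index: int) -> List[str]:
    """Return variants of `peptide` with the residue at 0-based `index`
    replaced by each member of its Table 4 synonymous group."""
    original = peptide[index]
    group = SYNONYMOUS.get(original, original)
    alternatives = [aa for aa in group if aa != original]
    return [peptide[:index] + aa + peptide[index + 1:] for aa in alternatives]
```

Whether a given variant retains the desired type-specific reactivity has to be established experimentally; the enumeration only lists candidates.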



The polypeptides of the invention, and particularly the fragments, can be prepared by 
classical chemical synthesis. 

The synthesis can be carried out in homogeneous solution or in solid phase. 

For instance, the synthesis technique in homogeneous solution which can be used is the one
described by Houben-Weyl in the book entitled "Methoden der organischen Chemie" (Methods
of organic chemistry), edited by E. Wünsch, vol. 15-I and II, Thieme, Stuttgart 1974.

The polypeptides of the invention can also be prepared in solid phase according to the
methods described by Atherton and Sheppard in their book entitled "Solid phase peptide
synthesis" (IRL Press, Oxford, 1989).

The polypeptides according to this invention can be prepared by means of recombinant
DNA techniques as described by Maniatis et al. (Molecular Cloning: A Laboratory Manual,
New York, Cold Spring Harbor Laboratory, 1982).

The present invention relates particularly to a polypeptide or peptide composition as
defined above, wherein said contiguous sequence contains in its sequence at least one of the
following amino acid residues:
L7, Q43, M44, S60, R67, Q70, T71, A79, A87, N106, K115, A127, A190, S130, V134,
G142, I144, E152, A157, V158, P105, S177 or Y177, I178, V180 or E180 or F182, R184,
I186, H187, T189, A190, S191 or G191, Q192 or L192 or I192 or V192 or E192, N193 or
H193 or P193, W194 or Y194, H195, A197 or I197 or V197 or T197, V202, I203 or L203,
Q208, A210, V212, F214, I216, R217 or D217 or E217 or V217, H218 or N218, H219 or
V219 or L219, L227 or I227, M231 or E231 or Q231, T232 or D232 or A232 or K232,
Q235 or I235, A237 or T237, I242, I246, S247, S248, V249, S250 or Y250, I251 or V251
or M251 or F251, D252, T254 or V254, L255 or V255, E256 or A256, M258 or F258 or
V258, A260 or Q260 or S260, A261, T264 or Y264, M265, I266 or A266, A267, G268 or
T268, F271 or M271 or V271, I277, M280 or H280, I284 or A284 or L284, V274, V291,
N292 or S292, R293 or I293 or Y293, Q294 or R294, L297 or I297 or Q297, A299 or K299
or Q299, N303 or T303, T308 or L308, T310 or F310 or A310 or D310 or V310, L313,
G317 or Q317, L333, S351, A358, A359, A363, S364, A366, T369, L373, F376, Q386,
I387, S392, I399, F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501, S521,
K522, H524, N528, S531, S532, V534, F536, F537, M539, I546, C1282, A1283, H1310,
V1312, Q1321, P1368, V1372, V1373, K1405, Q1406, S1409, A1424, A1429, C1435,
S1436, S1456, H1496, A1504, D1510, D1529, I1543, N1567, D1556, N1567, M1572,
Q1579, L1581, S1583, F1585, V1595, E1606 or T1606, M1611, V1612 or L1612, P1630,
C1636, P1651, T1656 or I1656, L1663, V1667, V1677, A1681, H1685, E1687, G1689,
V1695, A1700, Q1704, Y1705, A1713, A1714 or S1714, M1718, D1719, A1721 or T1721,
R1722, A1723 or V1723, H1726 or G1726, E1730, V1732, F1735, I1736, S1737, R1738,
T1739, G1740, Q1741, K1742, Q1743, A1744, T1745, L1746, E1747 or K1747, I1749,
A1750, T1751 or A1751, V1753, N1755, K1756, A1757, P1758, A1759, H1762, T1763,
Y1764, P2645, A2647, K2650, K2653 or L2653, S2664, N2673, F2680, K2681, L2686,
H2692, Q2695 or L2695 or I2695, V2712, F2715, V2719 or Q2719, T2722, T2724, S2725,
R2726, G2729, Y2735, H2739, I2748, G2746 or I2746, I2748, P2752 or K2752, P2754 or
T2754, T2757 or P2757,

with said notation being composed of a letter representing the amino acid residue by its one- 
letter code, and a number representing the amino acid numbering according to Kato et al., 
1990 as shown in Table 1 (comparison with other isolates). See also the numbering in Figures 
2, 5, 7, and 11 (alignment amino acid sequences). 

Within the group of unique and new amino acid residues of the present invention, the
following residues were found to be specific for the following types of HCV according to the
HCV classification system used in the present invention:

- Q208, R217, E231, I235, I246, T264, I266, A267, F271, K299, L2686, Q2719,
which are specific for the HCV subtype 2d sequences of the present invention as
shown in Fig. 5 and 2;

- Q43, S60, R67, F182, I186, H187, A190, S191, L192, W194, V202, L203, V219,
Q231, D232, A237, T254, M280, Q299, T303, L308, and/or L313 which are
specific for the Core/E1 region of HCV type 3 of the invention as shown in Fig. 5;

- D1556, Q1579, L1581, S1584, F1585, E1606, V1612, P1630, C1636, T1656,
L1663, H1685, E1687, G1689, V1695, Y1705, A1713, A1714, A1721, V1723,
H1726, R1738, Q1743, A1744, E1747, I1749, A1751, A1759 and/or H1762 which
are specific for the NS3/4 region of HCV type 3 sequences of the invention as
shown in Fig. 7;

- K2665, D2666, R2670 which are specific for the NS5B region of HCV type 3 of
the invention as shown in Fig. 2;

- L7, A79, A127, S130, E152, V158, S177 or Y177, V180 or E180, R184, T189,
Q192 or E192 or I192, N193 or H193, I197 or V197, I203, A210, V212, E217,
H218, H219, L227, A232, V249, I251 or M251, D252, L255 or V255, E256,
M258 or V258 or F258, A260 or Q260, M265, T268, V271, V274, M280, I284,
N292 or S292, Q294, L297 or I297, T308, A310 or D310 or V310 or T310, and
G317 which are specific for the Core/E1 region of HCV type 4 sequences of the
present invention as shown in Fig. 5;

- P2645, K2650, K2653, G2656, V2658, T2668, N2673 or N2673, K2681, H2686,
D2691, L2692, Q2695 or L2695 or I2695, Y2704, V2712, F2715, V2719, I2722,
S2725, G2729, Y2735, G2746 or I2746, P2752 or K2752, Q2753, P2754 or
T2754, T2757 or P2757 which are specific for the NS5B region of the HCV type
4 sequences of the present invention as shown in Fig. 2;

- M44, Q70, A87, N106, K115, V137, G142, P165, I178, F251, A299, N303, Q317
which are specific for the Core/E1 region of the HCV type 4 sequences of the
present invention as shown in Fig. 5;

- L333, S351, A358, A359, A363, S364, A366, T369, L373, F376, Q386, I387,
S392, I399, F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501,
S521, K522, H524, N528, S532, V534, F537, M539, I546 which are specific for
the E1/E2 region of the HCV type 5 sequences of the present invention as shown
in Fig. 12;

- C1282, A1283, V1312, Q1321, P1368, V1372, K1405, Q1406, S1409, A1424,
A1429, C1435, S1436, S1456, H1496, A1504, D1510, D1529, I1543, N1567,
M1572, V1595, T1606, M1611, [...], V1667, A1681, A1700, [...], S1714, [...],
R1722, A1723, G1726, F1735, I1736, S1737, T1739, G1740, K1742, T1745,
L1746, K1747, A1750, V1753, N1755, A1757, P1758, T1763, and Y1764 which
are specific for the NS3/NS4 region of HCV type 5 sequences of the invention
as shown in Fig. 7;

- A2647, L2653, S2674, F2680, T2724, R2726, Y2730, H2739 which are specific
for the NS5B region of the HCV type 5 sequences of the present invention as
shown in Fig. 2;

- A256, [...], V1677, Q1704, E1730, V1732, Q1741, and T1751 which are specific
for the HCV type 3 and 5 sequences of the present invention as shown in Fig. 5
and 7;

- [...], A157, [...], V251, S260, M271, T2673, T2722, I2748
which are specific for the HCV type 3 and 4 sequences of the present invention as
shown in Fig. 5 and 2;

- V192, Y194, A197, P249, S250, R294 which are specific for the HCV type 4 and 
5 sequences of the present invention as shown in Fig. 5; 

- I293 which is specific for the HCV type 4 and subtype 2d sequence of the present
invention as shown in Fig. 5;

- D217 and R294 which are specific for the HCV type 3, 4 and 5 sequences of the
present invention as shown in Fig. 5;

- L192 which is specific for the HCV type 3 and subtype 2d sequences of the present
invention as shown in Fig. 5;

- G191 and T197 which are specific for the HCV type 3, 4 and subtype 2d sequences
of the present invention as shown in Fig. 5;

- K232 which is specific for the HCV subtype 2d and type 5 sequences of the present
invention as shown in Fig. 5,

and with said notation being composed of a letter, unambiguously representing the amino acid
by its one-letter code, and a number representing the amino acid numbering according to Kato
et al., 1990 (see also Table 1 for comparison with other isolates), as well as Figure 2 (NS5
region), Figure 5 (Core/E1 region), Figure 7 (NS3/NS4 region), Figure 12 (E1/E2 region).
Some of the above-mentioned amino acids may be contained in type or subtype specific
epitopes.

For example M231 (detected in type 5) refers to a methionine at position 231. A glutamine
(Q) is present at the same position 231 in type 3 isolates, whereas this position is occupied
by an arginine in type 1 isolates and by a lysine (K) or asparagine (N) in type 2 isolates (see
Figure 5).
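
The residue notation can be handled mechanically, as in the following Python sketch. The function names are hypothetical. The sketch assumes that the sequence fragment is ungapped and that the Kato et al. (1990) number of its first residue is known, so that a position maps directly to an index in the fragment.

```python
import re
from typing import Tuple

# One-letter amino acid code followed by a position number, e.g. "M231".
_RESIDUE = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)$")


def parse_residue(tag: str) -> Tuple[str, int]:
    """Split a tag such as 'M231' into ('M', 231)."""
    match = _RESIDUE.match(tag.strip())
    if match is None:
        raise ValueError(f"not a residue tag: {tag!r}")
    return match.group(1), int(match.group(2))


def has_residue(fragment: str, first_position: int, tag: str) -> bool:
    """Check whether an ungapped fragment, whose first residue carries the
    number `first_position`, shows the residue named by `tag`."""
    amino_acid, position = parse_residue(tag)
    index = position - first_position
    return 0 <= index < len(fragment) and fragment[index] == amino_acid
```

As a usage example, the V3 sequence VQDGNTSTCWTPV (SEQ ID NO 87, positions 230 to 242) given further on in this description carries Q231 and D232, so `has_residue("VQDGNTSTCWTPV", 230, "Q231")` evaluates to true, whereas the same call with "M231" evaluates to false.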

The peptide or polypeptide according to this embodiment of the invention may optionally be
labelled, or attached to a solid substrate, or coupled to a carrier molecule such as biotin, or
mixed with a proper adjuvant.

The variable region in the core protein (V-CORE in Fig. 5) has been shown to be useful
for serotyping (Machida et al., 1992). The sequence of the disclosed type 5 sequence in this
region shows type-specific features. The peptide from amino acid 70 to 78 shows the
following unique sequence for the sequences of the present invention (see Figure 5):
QPTGRSWGQ (SEQ ID NO 93)
RSEGRTSWAQ (SEQ ID NO 220)
and RTEGRTSWAQ (SEQ ID NO 221)
Another preferred V-Core spanning region is the peptide spanning positions 60 to 78 of
subtype 3c with sequence:
SRRQPIPRARRTEGRSWAQ (SEQ ID NO 268)

Five type-specific variable regions (V1 to V5) can be identified after aligning E1 amino
acid sequences of the 4 genotypes, as shown in Figure 5.

Region V1 encompasses amino acids 192 to 203; these are the amino-terminal 12 amino acids
of the E1 protein. The following unique sequences as shown in Fig. 5 can be deduced:

LEWRNTSGLYVL (SEQ ID NO 83) 

VNYRNASGIYHI (SEQ ID NO 126) 

QHYRNISGIYHV (SEQ ID NO 127) 

EHYRNASGIYHI (SEQ ID NO 128) 

IHYRNASGIYHI (SEQ ID NO 224) 

VPYRNASGIYHV (SEQ ID NO 84) 

VNYRNASGIYHI (SEQ ID NO 225) 

VNYRNASGVYHI (SEQ ID NO 226) 

VNYHNTSGIYHL (SEQ ID NO 227)

QHYRNASGIYHV (SEQ ID NO 228)

QHYRNVSGIYHV (SEQ ID NO 229)

IHYRNASDGYVT (SEQ ID NO 230)

LQVKNTSSSYMV (SEQ ID NO 231)

Region V2 encompasses amino acids 213 to 223. The following unique sequences can be
found in the V2 region as shown in Figure 5:

VYEADDVILHT (SEQ ID NO 85)
VYETEHHILHL (SEQ ID NO 129)
VYEADHHIMHL (SEQ ID NO 130)
VYETDHHILHL (SEQ ID NO 131)
VYEADNLILHA (SEQ ID NO 86)
VWQLRAIVLHV (SEQ ID NO 232)
VYEADYHILHL (SEQ ID NO 233)
VYETDNHILHL (SEQ ID NO 234)
VYETENHILHL (SEQ ID NO 235)
VFETVHHILHL (SEQ ID NO 236)
VFETEHHILHL (SEQ ID NO 237)
VFETDHHIMHL (SEQ ID NO 238)
VYETENHILHL (SEQ ID NO 239)
VYEADALILHA (SEQ ID NO 240)

Region V3 encompasses the amino acids 230 to 242. The following unique V3 region
sequences can be deduced from Figure 5:

VQDGNTSTCWTPV (SEQ ID NO 87)
VQDGNTSACWTPV (SEQ ID NO 241)
VRVGNQSRCWVAL (SEQ ID NO 132)
VRTGNTSRCWVPL (SEQ ID NO 133)
VRAGNVSRCWTPV (SEQ ID NO 134)
EEKGNISRCWIPV (SEQ ID NO 242)
VKTGNQSRCWVAL (SEQ ID NO 243)
VRTGNQSRCWVAL (SEQ ID NO 244)
VKTGNQSRCWIAL (SEQ ID NO 245)
VKTGNVSRCWIPL (SEQ ID NO 247)
VKTGNVSRCWISL (SEQ ID NO 248)
VRKDNVSRCWVQI (SEQ ID NO 249)

Region V4 encompasses the amino acids 248 to 257. The following unique V4 region
sequences can be deduced from Figure 5:

VRYVGATTAS (SEQ ID NO 89)
APYIGAPLES (SEQ ID NO 135)
APYVGAPLES (SEQ ID NO 136)
AVSMDAPLES (SEQ ID NO 137)
APSLGAVTAP (SEQ ID NO 90)
APSFGAVTAP (SEQ ID NO 250)
VSQPGALTKG (SEQ ID NO 251)
VKYVGATTAS (SEQ ID NO 252)
APYIGAPVES (SEQ ID NO 253)
AQHLNAPLES (SEQ ID NO 254)
SPYVGAPLEP (SEQ ID NO 255)
SPYAGAPLEP (SEQ ID NO 256)
APYLGAPLEP (SEQ ID NO 257)
APYLGAPLES (SEQ ID NO 258)
APYVGAPLES (SEQ ID NO 259)
VPYLGAPLTS (SEQ ID NO 260)
APHLRAPLSS (SEQ ID NO 261)
APYLGAPLTS (SEQ ID NO 262)

Region V5 encompasses the amino acids 294 to 303. The following unique V5 region
peptides can be deduced from Figure 5:

RPRRHQTVQT (SEQ ID NO 91) 

QPRRHWTTQD (SEQ ID NO 138) 

RPRRHWTTQD (SEQ ID NO 139) 

RPRQHATVQN (SEQ ID NO 92)

RPRQHATVQD (SEQ ID NO 263) 

SPQHHKFVQD (SEQ ID NO 264) 

RPRRLWTTQE (SEQ ID NO 265) 

PPRIHETTQD (SEQ ID NO 266) 
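
The variable regions listed above can be cut out of an E1 fragment by position, as sketched below in Python. The V1, V3, V4 and V5 boundaries are those stated in this description; the V2 boundaries (213 to 223) are an assumption inferred from the 11-residue V2 sequences listed, because the corresponding sentence is damaged in the source text. The function name is hypothetical, and the fragment is assumed to be ungapped with a known number for its first residue.

```python
from typing import Dict, Tuple

# 1-based inclusive boundaries (numbering according to Kato et al., 1990).
# V2 is an inferred assumption, see the lead-in above.
E1_VARIABLE_REGIONS: Dict[str, Tuple[int, int]] = {
    "V1": (192, 203),
    "V2": (213, 223),
    "V3": (230, 242),
    "V4": (248, 257),
    "V5": (294, 303),
}


def extract_region(fragment: str, first_position: int, region: str) -> str:
    """Return the residues of the named variable region from an ungapped fragment
    whose first residue carries the number `first_position`."""
    start, end = E1_VARIABLE_REGIONS[region]
    low = start - first_position
    high = end - first_position + 1
    if low < 0 or high > len(fragment):
        raise ValueError(f"fragment does not cover region {region}")
    return fragment[low:high]
```

The extracted peptide can then be compared with the type-specific sequences listed for each region.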

The variable region in the E2 region (HVR-2) of type 5a as shown in Figure 12 spanning
amino acid positions 471 to 484 is also a preferred peptide according to the present invention
with the following sequence:

TISYANGSGPSDDK (SEQ ID NO 267)

[...] given [...] of peptides [...] for [...].

[...] polypeptide containing at least 5 contiguous amino acids derived from the [...].

According to a specific embodiment, the present invention relates to a composition as
defined above, wherein said contiguous sequence is selected from any of the following HCV
amino acid type 3 sequences:

- a sequence having a homology of more than [...], [...] more than 72%, preferably more
than 74%, more [...] more than 77% and most preferably more than 80 or 84% homology
[...];

- a sequence having a homology of more than 70%, preferably more than 72%, more
[...] more than [...]% homology, [...] to 319 [...];

- a sequence having a homology of more than [...], [...] more than 86%, preferably more
than 88% and most [...];

- a sequence having a homology of more than 76%, preferably more than 78%, most
preferably [...] to any of the amino acid sequences as represented in SEQ ID
NO 30, 32, 34, 36, 38 or 40 (HGC153, HD10, BR36 sequences) in the region spanning
positions 1646 to 1764 in the NS3/NS4 region as shown in Figure 7 and 11;

- a sequence having a homology of more than 81%, preferably more than 83%, and most
preferably more than [...] homology to any of the amino acid sequences as represented
in SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 (HD10, BR36, BR33 sequences) in the
region spanning positions 140 to 319 in the Core/E1 region as shown in Figure 5;

- a sequence having a homology of more than 81.5%, preferably more than 83%, and most
preferably more than 86% homology to any of the amino acid sequences as represented
in SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 (HD10, BR36, BR33 sequences) in the
E1 region spanning positions 192 to 319 as shown in Figure 5;

- a sequence having a homology of more than 86%, preferably more than 88%, most
preferably more than 90% to the amino acid sequence as represented in SEQ ID NO 150
(type 3c BE98) in the region spanning positions 2645 to 2757 in the NS5B region as shown
in Figure 2.

According to yet another embodiment, the present invention relates to a composition as
defined above, wherein said contiguous sequence is selected from any of the following HCV
amino acid type 4 sequences:

- a sequence having a homology of more than 80%, preferably more than 82%, most
preferably more than 84% homology to any of the amino acid sequences as represented
in SEQ ID NO 118, 120, and 122 (GB358, GB549, GB809 sequences) in the region
spanning positions 127 to 319 of the Core/E1 region as shown in Figure 5;

- a sequence having a homology of more than 73%, preferably more than 75%, most
preferably more than 78% homology in the E1 region spanning positions 192 to 319 to any

of the amino acid sequences as represented in SEQ ID NO 118, 120, and 122 (GB358, 
GB549, GB809 sequences) in the region spanning positions 140 to 319 of the Core/El 
region as shown in Figure 5; 

- a sequence having more than 85%, preferably more than 86%, most preferably more than 
87% homology to any of the amino acid sequences as represented in SEQ ID NO 118, 120 
or 122 (GB358, GB549, GB809 sequences) in the region spanning positions 192 to 319 of 
El as shown in Figure 5; 

- a sequence showing more than 73%, preferably more than 74%, most preferably more than
75% homology to any of the amino acid sequences as represented in SEQ ID NO 106, 
108, 110, 112, 114 or 116 (GB48, GB116, GB215, GB358, GB549, GB809 sequences) 
in the region spanning positions 2645 to 2757 of the NS5B region as shown in Figure 2; 

- a sequence having any of the sequences as represented in SEQ ID NO 164 or 166 (GB809 
and CAM600 sequences) in the Core/El region as shown in Figure 5; 

- a sequence having any of the sequences as represented in SEQ ID NO 168, 170, 172, 174,
176, 178, 180, 182, 184, 186, 188 or 190 (CAM600, GB809, CAMG22, CAMG27,
GB549, GB438, CAR4/1205, CAR4/901, GB116, GB215, GB958, GB809-4 sequences)
in the E1 region as shown in Figure 5;

- a sequence having any of the sequences as represented in SEQ ID NO 192, 194, 196, 198,
200, 202, 204, 206, 208, 210, 212 (GB358, GB724, BE100, PC,
etc.) in the NS5B region.

The above-mentioned type 4 peptides or polypeptides comprise at least an amino acid
[...] sequence as disclosed by Simmonds et al. (1993, EG-29, see Figure 5).

According to yet another aspect, the present invention relates to a composition as defined
above, wherein said contiguous sequence is selected from any of the following HCV amino
acid type 5 sequences:

- a sequence having more than 93%, preferably more than 94%, most preferably more than
95% homology in the region spanning Core positions 1 to 191 to any of the amino acid
sequences as represented in SEQ ID NO [...], 152 (BE95) as shown in Figure 5;

- a sequence having more than 73%, preferably more than 74%, most preferably
more than 76% homology in the region spanning E1 positions 192 to 319 to any
of the amino acid sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52
or 54 (PC sequences) as shown in Figure 5;

- a sequence having more than 78%, preferably more than [...], most preferably more
than 83% homology to any of the amino acid sequences as represented in SEQ ID NO 42,
44, 46, 48, 50, 52, 54, 154, 156 (BE95, BE100) (PC sequences) in the region spanning
positions 1 to 319 of the Core/E1 region as shown in Figure 5;

- a sequence having more than 90%, preferably more than 91%, most preferably more than
92% homology to any of the amino acid sequences represented in SEQ ID NO 56 to 58
(PC sequences) in the region spanning positions 1286 to 1403 of the NS3 region as shown
in Figure 7 or 11;

- a sequence having more [...]
homology to any of the amino acid sequences as represented in SEQ ID NO 60 or 62 (PC
sequences) in the region spanning positions 1646 to 1764 of the NS3/4 region as shown
in Figure 7 or 11.

According to yet another embodiment, the present invention relates to a
composition as defined above, wherein said contiguous sequence is selected from any of
the following HCV amino acid type 2d sequences:

- a sequence having more than 83%, preferably more than 85%, most preferably more than
87% homology to the amino acid sequence as represented in SEQ ID NO 144 (NE92) in
the region spanning positions 1 to 319 of the Core/E1 region as shown in Figure 5;

- a sequence having more than 79%, preferably more than 81%, most preferably more than
84% homology in the region spanning E1 positions 192 to 319 to the amino acid sequence
as represented in SEQ ID NO 144 (NE92) as shown in Figure 12;

- a sequence having more than 95%, more particularly 96%, most particularly 97% or more
homology to the amino acid sequence as represented in SEQ ID NO 146 (NE92) in the
region spanning positions 2645 to 2757 of the NS5B region as shown in Figure 2.

The present invention also relates to a recombinant vector, particularly for cloning and/or
expression, with said recombinant vector comprising a vector sequence, an appropriate
prokaryotic, eukaryotic or viral promoter sequence followed by the nucleotide sequences as
defined above, with said recombinant vector allowing the expression of any one of the HCV
type 2 and/or HCV type 3 and/or type 4 and/or type 5 derived polypeptides as defined above
in a prokaryotic or eukaryotic host or in living mammals when injected as naked DNA, and
more particularly a recombinant vector allowing the expression of any of the following HCV
type 2d, type 3, type 4 or type 5 polypeptides spanning the following amino acid positions:
- a polypeptide starting at position 1 and ending at any position in the region between
  positions 70 and 326, more particularly a polypeptide spanning positions 1 to 70,
  positions 1 to 85, positions 1 to 120, positions 1 to 150, positions 1 to 191, positions 1 to
  200, for expression of the Core protein, and a polypeptide spanning positions 1 to
  263, positions 1 to 326, for expression of the Core and E1 protein;
- a polypeptide starting at any position in the region between positions 117 and 192,
  and ending at any position in the region between positions 263 and 326, for
  expression of E1, or forms that have the putative membrane anchor deleted
  (positions 264 to 293 plus or minus 8 amino acids);
- a polypeptide starting at any position in the region between positions 1556 and
  1688, and ending at any position in the region between positions 1739 and 1764,
  for expression of the NS4 regions, more particularly a polypeptide starting at
  position 1658 and ending at position 1711 for expression of the NS4a antigen, and
  more particularly, a polypeptide starting at position 1712 and ending between
  positions 1743 and 1972, for example 1712-1743, 1712-1764, 1712-1782, 1712-1972,
  1712 to 1782 and 1902 to 1972 for expression of the NS4b protein or parts
  thereof.
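
The position ranges listed above are 1-based and inclusive. As a minimal sketch only (the function names and the toy sequence are illustrative assumptions, not part of the disclosure), a fragment could be cut from a polyprotein sequence, and an internal stretch such as the putative E1 membrane anchor at positions 264 to 293 removed, as follows:

```python
def fragment(seq, start, end):
    """Return residues start..end of seq (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range lies outside the supplied sequence")
    return seq[start - 1:end]


def fragment_without(seq, start, end, del_start, del_end):
    """Return residues start..end with del_start..del_end removed."""
    if not (start < del_start <= del_end < end):
        raise ValueError("deleted stretch must lie strictly inside the fragment")
    return fragment(seq, start, del_start - 1) + fragment(seq, del_end + 1, end)


# Toy 400-residue placeholder; a real polyprotein sequence would be used instead.
toy = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(400))

core_1_191 = fragment(toy, 1, 191)                           # Core, positions 1 to 191
e1_no_anchor = fragment_without(toy, 192, 326, 264, 293)     # E1 without anchor
```

The "plus or minus 8 amino acids" tolerance mentioned for the anchor would simply shift `del_start` and `del_end` in this sketch.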

The term "vector" may comprise a plasmid, a cosmid, a phage, or a virus.
In order to carry out the expression of the polypeptides of the invention in bacteria such
as E. coli or in eukaryotic cells such as in S. cerevisiae, or in cultured vertebrate or
invertebrate hosts such as insect cells, Chinese Hamster Ovary (CHO), COS, BHK, and
MDCK cells, the following steps are carried out:

- transformation of an appropriate cellular host with a recombinant vector in which
  a nucleotide sequence coding for one of the polypeptides of the invention has been
  inserted under the control of the appropriate regulatory elements, particularly a
  promoter recognized by the polymerases of the cellular host and, in the case of a
  prokaryotic host, an appropriate ribosome binding site (RBS), enabling the
  expression in said cellular host of said nucleotide sequence. In the case of a
  eukaryotic host, any artificial signal sequence or pre/pro sequence might be
  provided, or the natural HCV signal sequence might be employed, e.g. for
  expression of E1 the signal sequence starting between amino acid positions 117 and
  170 and ending at amino acid position 191 can be used, for expression of NS4 the
  signal sequence starting between amino acid positions 1646 and 1659 can be used,

- culture of said transformed cellular host under conditions enabling the expression
  of said insert.

The present invention also relates to a composition as defined above, wherein said 
polypeptide is a recombinant polypeptide expressed by means of an expression vector as 
defined above. 

The present invention also relates to a composition as defined above, for use in a method
for immunizing a mammal, preferably humans, against HCV comprising administering a
sufficient amount of the composition, possibly accompanied by pharmaceutically acceptable
adjuvants, to produce an immune response, more particularly a vaccine composition including
HCV type 3 polypeptides derived from the Core, E1 or the NS4 region and/or HCV type 4
and/or HCV type 5 polypeptides and/or HCV type 2d polypeptides.

The present invention also relates to an antibody raised upon immunization with a 
composition as defined above by means of a process as defined above, with said antibody 
being reactive with any of the polypeptides as defined above, and with said antibody being 
preferably a monoclonal antibody. 

The monoclonal antibodies of the invention can be produced by any hybridoma liable
to be formed according to classical methods from splenic cells of an animal, particularly from
a mouse or rat, immunized against the HCV polypeptides according to the invention, or
muteins thereof, or fragments thereof as defined above on the one hand, and of cells of a
myeloma cell line on the other hand, and to be selected by the ability of the hybridoma to
produce the monoclonal antibodies recognizing the polypeptides which have been initially used
for the immunization of the animals.

The antibodies involved in the invention can be labelled by an appropriate label of the 
enzymatic, fluorescent, or radioactive type.

The monoclonal antibodies according to this preferred embodiment of the invention may
be humanized versions of mouse monoclonal antibodies made by means of recombinant DNA
technology, starting from parts of mouse and/or human genomic DNA sequences coding
for H and L chains or from cDNA clones coding for H and L chains.

Alternatively, the monoclonal antibodies according to this preferred embodiment of the
invention may be human monoclonal antibodies. These antibodies according to the present
embodiment of the invention can also be derived from human peripheral blood lymphocytes
of patients infected with type 3, type 4 or type 5 HCV, or vaccinated against HCV. Such
human monoclonal antibodies are prepared, for instance, by means of human peripheral blood
lymphocytes (PBL) repopulation of severe combined immune deficiency (SCID) mice (for
recent review, see Duchosal et al. 1992).

The invention also relates to the use of the proteins of the invention, muteins thereof, or 
peptides derived therefrom for the selection of recombinant antibodies by the process of 
repertoire cloning (Persson et al., 1991). 

Antibodies directed to peptides derived from a certain genotype may be used either for
the detection of such HCV genotypes, or as therapeutic agents.

The present invention also relates to the use of a composition as defined above for
incorporation into an immunoassay for detecting HCV present in a biological sample liable to
contain it, comprising at least the following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
    with any of the compositions as defined above, preferably in an immobilized form
    under appropriate conditions which allow the formation of an immune complex,
    wherein said polypeptide can be a biotinylated polypeptide which is covalently
    bound to a solid substrate by means of streptavidin or avidin complexes,

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
    specifically bind to the antibodies present in the sample to be analyzed, with said
    heterologous antibodies being conjugated to a detectable label under appropriate
    conditions,

(iv) detecting the presence of said immune complexes visually or by means of
    densitometry and inferring the HCV serotype present from the observed
    hybridization pattern.

The present invention also relates to the use of a composition as defined above for
incorporation into a serotyping assay for detecting one or more serological types of HCV
present in a biological sample liable to contain it, more particularly for detecting E1 and NS4
antigens or antibodies of the different types to be detected combined in one assay format,
comprising at least the following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
    or antigens of one or more serological types, with at least one of the compositions
    as defined above, in an immobilized form under appropriate conditions which allow
    the formation of an immune complex,

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
    specifically bind to the antibodies present in the sample to be analyzed, with said
    heterologous antibodies being conjugated to a detectable label under appropriate
    conditions,

(iv) detecting the presence of said immune complexes visually or by means of
    densitometry and inferring the presence of one or more HCV serological types
    present from the observed binding pattern.

The present invention also relates to the use of a composition as defined above for
immobilization on a solid substrate and incorporation into a reversed phase hybridization
assay, preferably for immobilization as parallel lines onto a solid support such as a membrane
strip, for determining the presence or the genotype of HCV according to a method as defined
above.

The present invention thus also relates to a kit for determining the presence of HCV
genotypes as defined above present in a biological sample liable to contain them, comprising:
- possibly at least one primer composition containing any primer selected from those
  defined above or any other HCV type 3 and/or HCV type 4, and/or HCV type 5,
  or universal HCV primers,
- at least one probe composition as defined above, with said probes being
  preferentially immobilized on a solid substrate, and more preferentially on one and
  the same membrane strip,
- a buffer or components necessary for producing the buffer enabling a hybridization
  reaction between these probes and the possibly amplified products to be carried out,
- means for detecting the hybrids resulting from the preceding hybridization,
- possibly also including an automated scanning and interpretation device for
  inferring the HCV genotypes present in the sample from the observed hybridization
  pattern.

The genotype may also be detected by means of a type-specific antibody as defined above,
which is linked to any polynucleotide sequence that can afterwards be amplified by PCR to
detect the immune complex formed (Immuno-PCR, Sano et al., 1992).

The present invention also relates to a kit for determining the presence of HCV antibodies
as defined above present in a biological sample liable to contain them, comprising:
- at least one polypeptide composition as defined above, preferentially in combination
  with other polypeptides or peptides from HCV type 1, HCV type 2 or other types
  of HCV, with said polypeptides being preferentially immobilized on a solid
  substrate, and more preferentially on one and the same membrane strip,
- a buffer or components necessary for producing the buffer enabling a binding
  reaction between these polypeptides and the antibodies against HCV present in the
  biological sample,
- means for detecting the immune complexes formed in the preceding binding
  reaction,
- possibly also including an automated scanning and interpretation device for
  inferring the HCV genotypes present in the sample from the observed binding
  pattern.

Figure Legends

Figure 1

Alignment of consensus nucleotide sequences for each of the type 3a isolates BR34, BR36
and BR33 (SEQ ID NO 1, 5 and 9), the type 4 isolates GB48, GB116, GB215, GB358, GB549,
GB809, CAM600, CAMG22, GB438, CAR4/1205 and CAR1/501
(SEQ ID NO 106, 108, 110, 112, 114, 116, 201, 203, 205, 207, 209 and 211), type 5a
isolates BE95 and BE96 (SEQ ID NO 159 and 161) and type 2d isolate NE92 (SEQ ID NO
145) from the region between nucleotides 7932 and 8271, with known sequences from the
corresponding region of isolates HCV-1, HCV-J, HC-J6, HC-J8, T1 and T9, and others as
shown in Table 3.

Figure 2

Alignment of amino acid sequences deduced from the nucleic acid sequences as
represented in Figure 1 from the subtype 3a clones BR34 (SEQ ID NO 2, 4), BR36 (SEQ ID
NO 6, 8) and BR33 (SEQ ID NO 10, 12), the subtype 3c clone BE98 (SEQ ID NO 150) and
the type 4 clones GB48 (SEQ ID NO 107), GB116 (SEQ ID NO 109), GB215 (SEQ ID NO
111), GB358 (SEQ ID NO 113), GB549 (SEQ ID NO 115), GB809 (SEQ ID NO 117),
CAM600, CAMG22, GB438, CAR4/1205, CAR1/501 (SEQ ID NO 202, 204, 206, 208,
210, 212); the type 5a clones BE95 and BE96 (SEQ ID NO 160 and 162); as well as the
subtype 2d isolate NE92 (SEQ ID NO 146) from the region between amino acids 2645 to
2757 with known sequences from the corresponding region of isolates HCV-1, HCV-J, HC-J6,
and HC-J8, T1 and T9, and other sequences as shown in Table 3.

Figure 3 

Alignment of type 2d, 3c, 4 and 5a nucleotide sequences from isolates NE92, BE98,
GB358, GB809, CAM600, GB724, BE95 (SEQ ID NO 143, 147, 191, 163, 165, 193 and
151) in the Core region between nucleotide positions 1 and 500, with known sequences from
the corresponding region of type 1, type 2, type 3 and type 4 sequences.

Figure 4 

Alignment of nucleotide sequences for the subtype 2d isolate NE92 (SEQ ID NO 143), the
type 4 isolates GB358 (SEQ ID NO 118 and 187), GB549 (SEQ ID NO 120 and 175), and
GB809-2 (SEQ ID NO 122 and 169), GB809-4, GB116, GB215, CAM600, CAMG22,
CAMG27, GB438, CAR4/1205, CAR4/901 (SEQ ID NO 189, 183, 185, 167, 171, 173, 177,
179, 181), sequences for each of the subtype 3a isolates HD10, BR36, and BR33 (SEQ ID
NO 13, 15, 17 (HD10), 19, 21 (BR36) and 23, 25 or 27 (BR33)) and the subtype 5a isolates
BE95 and BE100 (SEQ ID NO 143 and 195), from the region between nucleotides 379 and
957, with known sequences from the corresponding region of type 1 and 2 and 3.

Figure 5 

Alignment of amino acid sequences deduced from the new HCV nucleotide sequences of
the Core/E1 region of isolates BR33, BR36, HD10, GB358, GB549, and GB809, PC or
BE95, CAM600, and GB724 (SEQ ID NO 14, 20, 24, 119 or 192, 121, 123 or 164, 54 or
152, 166 and 194) from the region between positions 1 and 319, with known sequences from
type 1a (HCV-1), type 1b (HCV-J), type 2a (HC-J6), type 2b (HC-J8), NZL1, HCV-TR,
positions 7-89 of type 3a (E-b1), and positions 8-88 of type 4a (EG-29). V-Core, variable
region with type-specific features in the core protein; V1, variable region 1 of the E1 protein;
V2, variable region 2 of the E1 protein; V3, variable region 3 of the E1 protein; V4, variable
region 4 of the E1 protein; V5, variable region 5 of the E1 protein.

Figure 6

Alignment of nucleotide sequences of isolates HCCL53, HD10 and BR36, deduced from 
clones with SEQ ID NO 29, 31, 33, 35, 37 and 39, from the NS3/4 region between 
nucleotides 4664 to 5292, with known sequences from the corresponding region of isolates 
HCV-1, HCV-J, HC-J6, and HC-J8, EB1, EB2, EB6 and EB7. 

Figure 7 

Alignment of amino acid sequences deduced from the new HCV nucleotide sequences of 
the NS3/NS4 region of isolate BR36 (SEQ ID NO 36, 38 and 40) and BE95 (SEQ ID NO 
270). NS4-1, indicates the region that was synthesized as synthetic peptide 1 of the NS4 
region, NS4-5, indicates the region that was synthesized as synthetic peptide 5 of the NS4 
region; NS4-7, indicates the region that was synthesized as synthetic peptide 7 of the NS4 
region. 

Figure 8 

Reactivity of the three LiPA-selected (Stuyver et al., 1993) type 3 sera on the Inno-LIA
HCV Ab II assay (Innogenetics) (left), and on the NS4-LIA test. For the NS4-LIA test, NS4-1,
NS4-5, and NS4-7 peptides were synthesized based on the type 1 (HCV-1), type 2 (HC-J6)
and type 3 (BR36) prototype isolate sequences as shown in Table 4, and applied as parallel
lines onto a membrane strip as indicated. 1, serum BR33; 2, serum HD10; 3, serum DKH.

Figure 9

Nucleotide sequences of Core/E1 clones obtained from the PCR fragments PC-2, PC-3,
and PC-4, obtained from serum BE95 (PC-2-1 (SEQ ID NO 41), PC-2-6 (SEQ ID NO 43),
PC-4-1 (SEQ ID NO 45), PC-4-6 (SEQ ID NO 47), PC-3-4 (SEQ ID NO 49), and PC-3-8
(SEQ ID NO 51)) of subtype 5a isolate BE95.

A consensus sequence is shown for the Core and E1 region of isolate BE95, presented as
PC C/E1 with SEQ ID NO 53. Y: C or T; R: A or G; S: C or G.

Figure 10

Alignment of nucleotide sequences of clones with SEQ ID NO 197 and 199 (PC sequences 
see also SEQ ID NO 55, 57, 59) and SEQ ID NO 35, 37 and 39 (BR36 sequences) from the 
NS3/4 region between nucleotides 3856 to 5292, with known sequences from the 
corresponding region of isolates HCV-1, HCV-J, HC-J6, and HC-J8. 

Figure 11

Alignment of amino acid sequences of subtype 5a BE95 isolate PC clones with SEQ ID 
NO 56 and 58, from the NS3/4 region between amino acids 1286 to 1764, with known 
sequences from the corresponding region of isolates HCV-1, HCV-J, HC-J6, and HC-J8. 

Figure 12 

Alignment of amino acid sequences of subtype 5a isolate BE95 (SEQ ID NO 158) in the
E1/E2 region spanning positions 328 to 546, with known sequences from the corresponding
region of isolates HCV-1, HCV-J, HC-J6, HC-J8, NZL1 and HCV-TR (see Table 3).

Figure 13 

Alignment of the nucleotide sequences of subtype 5a isolate BE95 (SEQ ID NO 157) in 
the E1/E2 region with known HCV sequences as shown in Table 3. 

EXAMPLES

Example 1: The NS5b region of HCV type 3

Type 3 sera, selected by means of the INNO-LiPA HCV research kit (Stuyver et al., 1993)
from a number of Brazilian blood donors, were positive in the HCV antibody ELISA
(Innotest HCV Ab II; Innogenetics) and/or in the INNO-LIA HCV Ab II confirmation test
(Innogenetics). Only those sera that were positive after the first round of PCR reactions
(Stuyver et al., 1993) were retained for further study.

Reverse transcription and nested PCR: RNA was extracted from 50 µl serum and subjected
to cDNA synthesis as described (Stuyver et al., 1993). This cDNA was used as template for
PCR, for which the total volume was increased to 50 µl containing 10 pmoles of each primer,
3 µl of 10x Pfu buffer 2 (Stratagene) and 2.5 U of Pfu DNA polymerase (Stratagene). The
cDNA was amplified over 45 cycles consisting of 1 min at 94°C, 1 min at 50°C and 2 min at
72°C. The amplified products were separated by electrophoresis, isolated, cloned and
sequenced as described (Stuyver et al., 1993).
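
The cycling programme above fixes only the hold times of the three steps. As a small worked example (a sketch that assumes no initial denaturation, final extension or ramp time, none of which is stated here), the total hold time follows directly from the cycle definition:

```python
# Cycle definition as stated in the text: (step name, temperature in °C, minutes).
CYCLES = 45
STEPS = [
    ("denaturation", 94, 1),
    ("annealing", 50, 1),
    ("extension", 72, 2),
]


def total_hold_minutes(cycles, steps):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * sum(minutes for _name, _temp, minutes in steps)


hold = total_hold_minutes(CYCLES, STEPS)  # 45 cycles x 4 min per cycle
```

Real run times on a thermocycler would be longer, since ramping between temperatures is instrument dependent.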

Type 3a and 3b-specific primers in the NS5 region were selected from the published
sequences (Mori et al., 1992) as follows:

for type 3a:
HCPr161(+): 5'-ACCGGAGGCCAGGAGAGTGATCTCCTCC-3' (SEQ ID NO 63) and
HCPr162(-): 5'-GGGCTGCTCTATCCTCATCGACGCCATC-3' (SEQ ID NO 64);

for type 3b:
HCPr163(+): 5'-GCCAGAGGCTCGGAAGGCGATCAGCGCT-3' (SEQ ID NO 65) and
HCPr164(-): 5'-GAGCTGCTCTGTCCTCCTCGACGCCGCA-3' (SEQ ID NO 66)

Using the Line Probe Assay (LiPA) (Stuyver et al., 1993), seven high-titer type 3 sera
were selected and subsequently analyzed with the primer sets HCPr161/162 for type 3a, and
HCPr163/164 for type 3b. None of these sera was positive with the type 3b primers. NS5
PCR fragments obtained using the type 3a primers from serum BR36 (BR36-23), serum BR33
(BR33-2) and serum BR34 (BR34-4) were selected for cloning. The following sequences were
obtained from the PCR fragments:
From fragment BR34-4: 
BR34-4-20 (SEQ ID NO 1), BR34-4-19 (SEQ ID NO 3) 

From fragment BR36-23: 
BR36-23-18 (SEQ ID NO 5), BR36-23-20 (SEQ ID NO 7) 

From fragment BR33-2:
BR33-2-17 (SEQ ID NO 9), BR33-2-21 (SEQ ID NO 11)

An alignment of sequences with SEQ ID NO 1, 5 and 9 with known sequences is given
in Figure 1. An alignment of the deduced amino acid sequences is shown in Figure 2. The
3 isolates are very closely related to each other (mutual homologies of about 95%) and to the
published sequences of type 3a (Mori et al., 1992), but are only distantly related to type 1
and type 2 sequences (Table 5). Therefore, it is clearly demonstrated that NS5 sequences
from LiPA-selected type 3 sera are indeed derived from a type 3 genome. Moreover, by
analyzing the NS5 region of serum BR34, for which no 5'UR sequences were determined as
described in Stuyver et al. (1993), the excellent correlation between typing by means of the
LiPA and genotyping as deduced from nucleotide sequencing was further proven.
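
Mutual homologies such as those quoted above can be computed from an alignment. The following is a minimal sketch, assuming that "homology" means simple percent identity over aligned, ungapped positions (the text does not specify the exact definition); the three short sequences are toy data, not sequences of the invention.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        matches += (x == y)
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def pairwise_identities(aligned):
    """Map each pair of names to its percent identity."""
    return {
        (m, n): percent_identity(aligned[m], aligned[n])
        for m, n in combinations(aligned, 2)
    }


toy_alignment = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGAACG-AC",
}
identities = pairwise_identities(toy_alignment)
```

Other conventions (counting gaps as mismatches, or scoring conservative amino acid substitutions) would give different figures, so the definition used should be stated alongside any reported value.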

Example 2: The Core/E1 region of HCV type 3

After aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al., 1990),
HC-J6 (Okamoto et al., 1991), and HC-J8 (Okamoto et al., 1992), PCR primers were chosen
in those regions of little sequence variation. Primers HCPr23(+):
5'-CTCATGGGGTACATTCCGCT-3' (SEQ ID NO 67) and HCPr54(-):
5'-TATTACCAGTTCATGATCATATCCCA-3' (SEQ ID NO 68) were synthesized on a 392
DNA/RNA synthesizer (Applied Biosystems). This set of primers was selected to amplify
the sequence from nucleotide 397 to 957 encoding amino acids 140 to 319 (Kato et al., 1990):
52 amino acids from the carboxyterminus of core and 128 amino acids of E1 (Kato et al.,
1990). The amplification products BR36-9, BR33-1, and HD10-2 were cloned as described
(Stuyver et al., 1993). The following clones were obtained from the PCR fragments:
From fragment HD10-2: 
HD10-2-5 (SEQ ID NO 13), HD10-2-14 (SEQ ID NO 15), HD10-2-21 (SEQ ID NO 17) 

From fragment BR36-9: 
BR36-9-13 (SEQ ID NO 19), BR36-9-20 (SEQ ID NO 21), 

From fragment BR33-1: 
BR33-1-10 (SEQ ID NO 23), BR33-1-19 (SEQ ID NO 25), BR33-1-20 (SEQ ID NO 27), 

An alignment of the type 3 E1 nucleotide sequences (HD10, BR36, BR33) with SEQ ID
NO 13, 19 and 23 with known E1 sequences is presented in Figure 4. Four variations were
detected in the E1 clones from serum HD10 and BR36, while only 2 were found in BR33.
All are silent third letter variations, with the exception of mutations at position 40 (L to P)
and 125 (M to I). The homologies of the type 3 E1 region (without core) with type 1 and 2
prototype sequences are depicted in Table 5.

In total, 8 clones covering the core/E1 region of 3 different isolates were sequenced and
the E1 portion was compared with the known genotypes (Table 3) as shown in Figure 5.

After computer analysis of the deduced amino acid sequence, a signal-anchor sequence at the
core carboxyterminus was detected which might, through analogy with type 1b (Hijikata et
al., 1991), promote cleavage before the LEWRN sequence (position 192, Fig. 5). The L-to-P
mutation in one of the HD10-2 clones resides in this signal-anchor region and potentially
impairs recognition by signal peptidase (computer prediction). Since no examples of such
substitutions were found at this position in previously described sequences, this mutation
might have resulted from reverse transcriptase or Pfu polymerase misincorporation. The 4
amino-terminal potential N-linked glycosylation sites, which are also present in HCV types
1a and 2, remain conserved in type 3. The N-glycosylation site in type 1b (aa 250, Kato et
al., 1990) remains a unique feature of this subtype. All E1 cysteines, and the putative
transmembrane region (aa 264 to 293, computer prediction) containing the aspartic acid at
position 279, are conserved in all three HCV types. The following hypervariable regions can
be delineated: V1 from aa 192 to 203 (numbering according to Kato et al., 1990), V2 (213-
223), V3 (230-242), V4 (248-257), and V5 (294-303). Such hydrophilic regions are thought
to be exposed to the host defense mechanisms. This variability might therefore have been
induced by the host's immune response. Additional putative N-linked glycosylation sites in
the V4 region in all type 1b isolates known today and in the V5 region of HC-J8 (type 2b)
possibly further contribute to modulation of the immune response. Therefore, analysis of this
region, in the present invention, for type 3 and 4 sequences has been instrumental in the
delineation of epitopes that reside in the V-regions of E1, which will be critical for future
vaccine and diagnostics development.
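
The variable regions above are given in amino acid numbering. As a sketch under one explicit assumption, namely that nucleotide 1 is the first nucleotide of codon 1 (which is consistent with amino acids 1 to 319 corresponding to nucleotides 1 to 957 elsewhere in this text), amino acid ranges convert to nucleotide ranges as follows; the helper name `aa_to_nt` is illustrative only.

```python
# Hypervariable regions of E1 as listed in the text (amino acid positions, inclusive).
V_REGIONS = {
    "V1": (192, 203),
    "V2": (213, 223),
    "V3": (230, 242),
    "V4": (248, 257),
    "V5": (294, 303),
}


def aa_to_nt(aa_start, aa_end):
    """Convert an inclusive amino acid range to an inclusive nucleotide range.

    Assumes nucleotide 1 is the first nucleotide of codon 1.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("invalid amino acid range")
    return 3 * (aa_start - 1) + 1, 3 * aa_end


nt_ranges = {name: aa_to_nt(*span) for name, span in V_REGIONS.items()}
lengths = {name: end - start + 1 for name, (start, end) in V_REGIONS.items()}
```

If a different nucleotide origin is used (for instance numbering from the start of the 5' untranslated region), a constant offset would have to be added to both ends of each range.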

Example 3: The NS3/NS4 region of HCV Type 3 

For the NS3/NS4 border region, the following sets of primers were selected in the regions
of little sequence variability after aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J
(Kato et al., 1990), HC-J6 (Okamoto et al., 1991), and HC-J8 (Okamoto et al., 1992)
(smaller case lettering is used for nucleotides added for cloning purposes):

set A:
HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

set B:
HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)

set C:
HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

set D:
HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)

set E:
HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCAYTG-3' (SEQ ID NO 73)

set F:
HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCAYTG-3' (SEQ ID NO 73)

set G:
HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3' (SEQ ID NO 74)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

set H:
HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

set I:
HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

set J:
HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3' (SEQ ID NO 74)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)

set K:
HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)

set L:
HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)

set M:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr4(-): 5'-GACATGCATGTCATGATGTA-3' (SEQ ID NO 78)

set N:
HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)

set O:
HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

No PCR products could be obtained with the sets of primers A, B, C, D, E, F, G, H, I,
J, K, L, M, and N, on random-primed cDNA obtained from type 3 sera. With the primer set
O, no fragment could be amplified from type 3 sera; however, a smear containing a few
weakly stainable bands was obtained from serum BR36. After sequence analysis of several
DNA fragments, purified and cloned from the area around 300 bp on the agarose gel, only
one clone, HCCl53 (SEQ ID NO 29), was shown to contain HCV information. This
sequence was used to design primer HCPr152.
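
Several of the primers above are degenerate: they contain ambiguity codes (R, Y, W) and inosine (I). The sketch below enumerates the plain sequences a degenerate primer stands for. It uses the standard IUPAC meanings of the codes; treating inosine as "any base" (N) is a simplifying assumption made here for counting purposes only, since inosine is a single universal-pairing base rather than a true mixture.

```python
from itertools import product

# Standard IUPAC nucleotide codes needed for the primers in this section.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "W": "AT",
    "S": "CG",
    "N": "ACGT",
}


def expand(primer, inosine_as="N"):
    """List every non-degenerate sequence represented by a degenerate primer.

    Lower-case cloning tails are upper-cased; inosine (I) is mapped to
    `inosine_as` (an assumption, see the lead-in).
    """
    seq = primer.upper().replace("I", inosine_as)
    unknown = set(seq) - set(IUPAC)
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


# 3' portion of HCPr116 as printed above (R x I x Y = 2 x 4 x 2 variants).
variants = expand("ATGRCITGYATG")
```

The number of variants grows multiplicatively with each degenerate position, which is one reason highly degenerate primers such as HCPr118 may amplify inefficiently.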

A new primer set P was subsequently tested on several sera:

set P:
HCPr152(+): 5'-TACGCICTCTTCTATATCGGTTGGGGCCTG-3' (SEQ ID NO 79) and
HCPr66(-): 5'-CTATTATTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)

The 464-bp HCPr152/66 fragment was obtained from serum BR36 (BR36-20) and serum
HD10 (HD10-1). The following clones were obtained from these PCR products:

From fragment HD10-1: 
HD10-1-25 (SEQ ID NO 31), HD10-1-3 (SEQ ID NO 33),

From fragment BR36-20: 
BR36-20-164 (SEQ ID NO 35), BR36-20-165 (SEQ ID NO 37), BR36-20-166 (SEQ ID 
NO 39), 

The nucleotide sequences obtained from clones with SEQ ID NO 29, 31, 33, 35, 37 or
39 are shown aligned with the sequences of prototype isolates of other types of HCV in
Figure 6. In addition to one silent 3rd letter variation, one 2nd letter mutation resulted in an
E to G substitution at position 175 of the deduced amino acid sequence of BR36 (Fig. 7).
Serum HD10 clones were completely identical. The two type 3 isolates were nearly 94%
homologous in this NS4 region. The homologies with other types are presented in Table 5.

Example 4: Analysis of the anti-NS4 response to type-specific peptides

As the NS4 sequence contains the information for an important epitope cluster, and since
antibodies towards this region seem to exhibit little cross-reactivity (Chan et al., 1991), it was
worthwhile to investigate the type-specific antibody response to this region. For each of the
3 genotypes, HCV-1 (Choo et al., 1991), HC-J6 (Okamoto et al., 1991) and BR36 (present
invention), three 20-mer peptides were synthesized covering the epitope region between amino
acids 1688 and 1743 (as depicted in Table 6). The synthetic peptides were applied as parallel
lines onto membrane strips. Detection of anti-NS4 antibodies and color development was
performed according to the procedure described for the INNO-LIA HCV Ab II kit
(Innogenetics, Antwerp). Peptide synthesis was carried out on a 9050 PepSynthesizer
(Millipore). After incubation with 15 LiPA-selected type 3 sera, 9 samples showed reactivity
towards NS4 peptides of at least 2 different types, but a clearly positive reaction was
observed for 3 sera (serum BR33, HD30 and DKH) on the type 3 peptides, while these sera
were negative (serum BR33 and HD30) or indeterminate (serum DKH) on the type 1 and type 2
NS4 peptides; 3 sera tested negative for anti-NS4 antibodies (Figure 8). Using the same
membrane strips coated with the 9 peptides as indicated above and as shown in Figure 8, 38
type 1 sera (10 type 1a and 28 type 1b), 11 type 2 sera (10 type 2a and 1 type 2b), 12 type 3a
sera and 2 type 4 sera (as determined by the LiPA procedure) were also tested. As shown in
Table 8, the sera reacted in a genotype-specific manner with the NS4 epitopes. These results
demonstrate that type-specific anti-NS4 antibodies can be detected in the sera of some
patients. Such genotype-specific synthetic peptides might be employed to develop serotyping
assays, for example a mixture of the nine peptides as indicated above, or combined with the
NS4 peptides from the HCV type 4 or 6 genotype or from new genotypes corresponding to
the region between amino acids 1688 and 1743, or synthetic peptides of the NS4 region
between amino acids 1688 and 1743 of at least one of the 6 genotypes, combined with the E1
protein or deletion mutants thereof, or synthetic E1 peptides of at least one of the genotypes.
Such compositions could be further extended with type-specific peptides or proteins, including
for example the region between amino acids 68 and 91 of the core protein, or more
preferably the region between amino acids 68 and 78. Furthermore, such type-specific
antigens may be advantageously used to improve current diagnostic screening and
confirmation assays and/or HCV vaccines.
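
To illustrate how a binding pattern on the nine-peptide strip might be turned into a serotype call, the following is a hypothetical sketch. The scoring labels (`pos`, `ind`, `neg`) and the decision rule are assumptions made for illustration; the text does not define a scoring algorithm.

```python
PEPTIDES = ("NS4-1", "NS4-5", "NS4-7")
TYPES = (1, 2, 3)


def call_serotype(reactivity):
    """Call a serotype from a mapping (hcv_type, peptide) -> 'pos' | 'ind' | 'neg'.

    Hypothetical rule: a serum is assigned to a type when it is clearly
    positive on peptides of exactly one type; indeterminate lines on other
    types do not change the call.
    """
    positive_types = sorted({t for (t, _p), r in reactivity.items() if r == "pos"})
    if len(positive_types) == 1:
        return f"type {positive_types[0]}"
    if len(positive_types) > 1:
        return "cross-reactive"
    if any(r == "ind" for r in reactivity.values()):
        return "indeterminate"
    return "non-reactive"


# Toy pattern resembling a serum that is positive on type 3 peptides and
# indeterminate on a type 1 peptide.
pattern = {(t, p): "neg" for t in TYPES for p in PEPTIDES}
pattern[(3, "NS4-1")] = "pos"
pattern[(3, "NS4-5")] = "pos"
pattern[(1, "NS4-7")] = "ind"
result = call_serotype(pattern)
```

In practice the thresholds separating positive, indeterminate and negative lines would have to be calibrated against sera of known genotype, as was done here with LiPA-typed sera.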

Example 5: The Core and E1 regions of HCV type 5

Sample BE95 was selected from a group of sera that reacted positive in a prototype Line
Probe Assay as described earlier (Stuyver et al., 1993), because a high titer of HCV RNA
could be detected, enabling cloning of fragments by a single round of PCR. As no sequences
from any coding region of type 5 have been disclosed yet, synthetic oligonucleotides for PCR
amplification were chosen in the regions of little sequence variation after aligning the
sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al., 1990), HC-J6 (Okamoto et
al., 1991), HC-J8 (Okamoto et al., 1992), and the new type 3 sequences of the present
invention HD10, BR33, and BR36 (see Figure 5, Example 2). The following sets of primers
were synthesized on a 392 DNA/RNA synthesizer (Applied Biosystems):
Set 1:
HCPr52(+): 5'-atgTTGGGTAAGGTCATCGATACCCT-3' (SEQ ID NO 80) and
HCPr54(-): 5'-ctattaCCAGTTCATCATCATATCCCA-3' (SEQ ID NO 78)

Set 2:
HCPr41(+): 5'-CCCGGGAGGTCTCGTAGACCGTGCA-3' (SEQ ID NO 81) and
HCPr40(-): 5'-ctattaAAGATAGAGAAAGAGCAACCGGG-3' (SEQ ID NO 82)

Set 3:
HCPr41(+): 5'-CCCGGGAGGTCTCGTAGACCGTGCA-3' (SEQ ID NO 81) and
HCPr54(-): 5'-ctattaCCAGTTCATCATCATATCCCA-3' (SEQ ID NO 78)

The three sets of primers were employed to amplify the regions of the type 5 isolate PC
as described (Stuyver et al., 1993). Set 1 was used to amplify the E1 region and yielded
fragment PC-4; set 2 was designed to yield the Core region and yielded fragment PC-2. Set
3 was used to amplify the Core and E1 region and yielded fragment PC-3. These fragments
were cloned as described (Stuyver et al., 1993). The following clones were obtained from the
PCR fragments:

From fragment PC-2: 
PC-2-1 (SEQ ID NO 41), PC-2-6 (SEQ ID NO 43), 

From fragment PC-4: 
PC-4-1 (SEQ ID NO 45), PC-4-6 (SEQ ID NO 47), 
From fragment PC-3:
PC-3-4 (SEQ ID NO 49), PC-3-8 (SEQ ID NO 51)

An alignment of sequences with SEQ ID NO 41, 43, 45, 47, 49 and 51 is given in Figure
9. A consensus amino acid sequence (PC C/E1; SEQ ID NO 54) can be deduced from each
of the 2 clones cloned from each of the three PCR fragments as depicted in Figure 5, which
overlaps the region between nucleotides 1 and 957 (Kato et al., 1990). The 6 clones are very
closely related to each other (mutual homologies of about 99.1%).

An alignment of nucleotide sequence with SEQ ID NO 53 or 151 (PC C/E1 from isolate
BE95) with known nucleotide sequences from the Core/E1 region is given in Figure 3. The
clone is only distantly related to type 1, type 2, type 3 and type 4 sequences (Table 5).

Example 6 : The NS3/NS4 region of HCV type 5

Attempts were undertaken to clone the NS3/NS4 region of the isolate BE95, described in
example 5. The following sets of primers were selected in the regions of little sequence
variability after aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al.,
1990), HC-J6 (Okamoto et al., 1991), and HC-J8 (Okamoto et al., 1992) and of the
sequences obtained from type 3 sera of the present invention (SEQ ID NO 31, 33, 35, 37 and
39); lower case lettering is used for nucleotides added for cloning purposes:
set A:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set B:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set C:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set D:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set E:

HCPr116(+): 5'-ttttAAATACATCATGRCITGYATG-3' (SEQ ID NO 69)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCAYTG-3' (SEQ ID NO 73)
set F:

HCPr117(+): 5'-ttttAAATACATCGCIRCITGCATGCA-3' (SEQ ID NO 72)
HCPr119(-): 5'-actagtcgactaRTTIGCIATIAGCCG/TRTTCATCCAYTG-3' (SEQ ID NO 73)
set G:

HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3' (SEQ ID NO 74)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set H:

HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set I:

HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
set J:

HCPr131(+): 5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3' (SEQ ID NO 74)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set K:

HCPr130(+): 5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3' (SEQ ID NO 75)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set L:

HCPr134(+): 5'-CATATAGATGCCCACTTCCTATC-3' (SEQ ID NO 76)
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set M:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr4(-): 5'-GACATGCATGTCATGATGTA-3' (SEQ ID NO 78)
set N:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr118(-): 5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3' (SEQ ID NO 71)
set O:

HCPr3(+): 5'-GTGTGCCAGGACCATC-3' (SEQ ID NO 77) and
HCPr66(-): 5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3' (SEQ ID NO 70)
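Several of the primers listed above contain degenerate positions (R, Y, W, G/T) and inosine (I), so each primer stands for a pool of related sequences. The following Python sketch shows one way such a pool could be counted and enumerated. It is an illustration only: the helper names `degeneracy` and `expand` are hypothetical, and treating inosine as matching any of the four bases is a simplifying assumption, not part of the described method.

```python
from itertools import product

# Base sets for the ambiguity symbols that occur in the primers of this example.
# ASSUMPTION: inosine (I) is modelled as "any base"; K stands for G/T.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "K": "GT",
    "I": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return how many concrete sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(AMBIGUITY[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    pools = [AMBIGUITY[base] for base in primer.upper()]
    for combination in product(*pools):
        yield "".join(combination)


# HCPr66(-) and HCPr116(+) as listed above, without their lower-case cloning tails.
hcpr66 = "TTGTATCCCRCTGATGAARTTCCACAT"
hcpr116 = "AAATACATCATGRCITGYATG"

print(degeneracy(hcpr66))    # 4  (two R positions)
print(degeneracy(hcpr116))   # 16 (R x I x Y = 2 x 4 x 2)
for sequence in expand(hcpr66):
    print(sequence)
```

A higher degeneracy widens the range of templates a primer can anneal to, at the cost of a lower effective concentration of each individual sequence.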

No PCR products could be obtained with the sets of primers A, B, C, D, E, F, G,
H, I, J, K, L, M, and N, on random-primed cDNA obtained from type 5 sera. However,
set O yielded what appeared to be a PCR artifact fragment estimated at about 1450 base
pairs, instead of the expected 628 base pairs. Although it is not expected that PCR artifact
fragments contain information of the gene or genome that was targeted in the experiment,
efforts were put into cloning this artifact fragment, which was designated fragment PC-1.
The following clones were obtained from fragment PC-1:

PC-1-37 (SEQ ID NO 59 and SEQ ID NO 55), PC-1-48 (SEQ ID NO 61 and SEQ ID NO 57).

The sequences obtained from the 5' and 3' ends of the clones are given in SEQ ID NOS
55, 57, 59, and 61, and the complete sequences, with SEQ ID NO 197 and 199, are shown
aligned with the sequences of prototype isolates of other types of HCV in Figure 10, and the
alignment of the deduced amino acid sequences is shown in Figure 11 and 7. Surprisingly,
the PCR artifact clone contained HCV information. The positions of the sequences within the
HCV genome are compatible with a contiguous HCV sequence of 1437 nucleotides, which
was the estimated size of the cloned PCR artifact fragment. Primer HCPr66 primed correctly
at the expected position in the HCV genome. Therefore, primer HCPr3 must have
incidentally misprimed at a position 809 nucleotides upstream of its legitimate position in the
HCV genome. This could not be expected since no sequence information was available from
a coding region of type 5.
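The size argument in this paragraph can be checked with simple arithmetic: if the antisense primer anneals at its intended site and the sense primer anneals further upstream, the product grows by exactly the upstream shift. The short Python sketch below restates that check with the figures given in the text; the function name is hypothetical and chosen for illustration only.

```python
EXPECTED_PRODUCT_BP = 628   # expected size of the set O product (from the text)
UPSTREAM_SHIFT_NT = 809     # distance of the mispriming site from the legitimate HCPr3 site


def product_size_after_upstream_mispriming(expected_bp: int, upstream_shift: int) -> int:
    """Amplicon size when the sense primer anneals `upstream_shift` nucleotides
    5' of its intended site while the antisense primer anneals correctly."""
    return expected_bp + upstream_shift


observed = product_size_after_upstream_mispriming(EXPECTED_PRODUCT_BP, UPSTREAM_SHIFT_NT)
print(observed)  # 1437, in line with the roughly 1450 bp estimated for fragment PC-1
```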

Example 7 : The E2 region of HCV type 5

Serum BE95 was chosen for experiments aimed at amplifying a part of the E2 region of HCV 
type 5. 

After aligning the sequences of HCV-1 (2), HCV-J (1), HC-J6 (3), and HC-J8 (4), PCR
primers were chosen in those regions of little sequence variation.

Primers HCPr109(+): 5'-TGGGATATGATGATGAACTGGTC-3' (SEQ ID NO 141) and
HCPr14(-): 5'-CCAGGTACAACCGAACCAATTGCC-3' (SEQ ID NO 142) were combined
to amplify the aminoterminal region of the E2/NS1 region, and were synthesized on a 392
DNA/RNA synthesizer (Applied Biosystems). With primers HCPr109 and HCPr14, a PCR
fragment of 661 bp was generated, containing 169 nucleotides corresponding to the E1
carboxyterminus and 492 bases from the region encoding the E2 aminoterminus.

An alignment of the type 5 E1/E2 sequence with SEQ ID NO 158 with known sequences is
presented in Figure 10. The deduced protein sequence was compared with the different
genotypes (Fig. 12, amino acids 328-546). In the E1 region, no additional structurally
important motifs were found. The aminoterminal part of E2 was hypervariable when compared
with the other genotypes. All 6 N-glycosylation sites and all 7 cysteine residues were
conserved in this E2 region. To preserve alignment, it was necessary to introduce a gap
between aa 474 and 475, as for type 3a, but not between aa 480 and 481, as for type 2.

Example 8 : The NS5b region of HCV type 4

Type 4 sera GB48, GB116, GB215, and GB358, selected by means of the line probe assay
(LiPA, Stuyver et al., 1993), as well as sera GB549 and GB809 that could not be typed by
means of this LiPA (hybridization was observed only with the universal probes), were
selected from Gabonese patients. All these sera were positive after the first round of PCR
reactions for the 5' untranslated region (Stuyver et al., 1993) and were retained for further
study.

RNA was isolated from the sera and cDNA synthesized as described in example 1.
Universal primers in the NS5 region were selected after alignment of the published sequences
as follows:

HCPr206(+): 5'-TGGGGATCCCGTATGATACCCGCTGCTTTGA-3'
(SEQ ID NO. 124) and

HCPr207(-): 5'-GGCGGAATTCCTGGTCATAGCCTCCGTGAA-3'
(SEQ ID NO. 125);

and were synthesized on a 392 DNA/RNA synthesizer (Applied Biosystems). Using the Line
Probe Assay (LiPA), four high-titer type 4 sera and 2 sera that could not be classified were
selected and subsequently analyzed with the primer set HCPr206/207. NS5 PCR fragments
obtained using these primers from serum GB48 (GB48-3), serum GB116 (GB116-3), serum
GB215 (GB215-3), serum GB358 (GB358-3), serum GB549 (GB549-3), and serum GB809
(GB809-3) were selected for cloning. The following sequences were obtained from the PCR
fragments:

From fragment GB48-3: GB48-3-10 (SEQ ID NO. 106)
From fragment GB116-3: GB116-3-5 (SEQ ID NO. 108)
From fragment GB215-3: GB215-3-8 (SEQ ID NO. 110)
From fragment GB358-3: GB358-3-3 (SEQ ID NO. 112)
From fragment GB549-3: GB549-3-6 (SEQ ID NO. 114)
From fragment GB809-3: GB809-3-1 (SEQ ID NO. 116)

An alignment of nucleotide sequences with SEQ ID NO. 106, 108, 110, 112, 114, and 116
with known sequences is given in Figure 1. An alignment of deduced amino acid sequences
with SEQ ID NO. 107, 109, 111, 113, 115, and 117 with known sequences is given in Figure
2. The 4 isolates that had been typed as type 4 by means of LiPA are very closely related to
each other (mutual homologies of about 95%), but are only distantly related to type 1, type
2, and type 3 sequences (e.g. GB358 shows homologies of 65.6 to 67.7% with other
genotypes, Table 4). The sequences obtained from sera GB549 and GB809 also show similar
homologies with genotypes 1, 2, and 3 (65.9 to 68.8% for GB549 and 65.0 to 68.5% for
GB809, Table 4), but an intermediate homology of 79.7 to 86.8% (often observed between
subtypes of the same type) exists between GB549 or GB809 and the group of isolates
consisting of GB48, GB116, GB215, and GB358, or between GB549 and GB809. These data
indicate the discovery of 3 new subtypes within HCV genotype 4: in the present
invention, these 3 subtypes are designated subtype 4c, represented by isolates GB48, GB116,
GB215, and GB358, subtype 4g, represented by isolate GB549, and subtype 4e, represented
by isolate GB809. Although the homologies observed between subtypes in the NS5 region
seem to indicate a closer relationship between subtypes 4c and 4e, the homologies observed
in the E1 region indicate that subtypes 4g and 4e show the closest relation (see example 8).
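The homology figures discussed in this example are percentages of identical positions between aligned sequences. A minimal Python sketch of such a calculation follows. The handling of gaps (gapped columns are skipped), the two cut-off values, and the toy sequences are assumptions chosen only to illustrate the ranges quoted in the text (about 95% within a subtype, 79.7 to 86.8% between subtypes, 65 to 69% between types); they are not thresholds defined by the present invention.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    ASSUMPTION: columns in which either sequence has a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def classify(identity: float,
             same_subtype_min: float = 90.0,   # illustrative cut-off (assumption)
             same_type_min: float = 72.0) -> str:  # illustrative cut-off (assumption)
    """Rough relationship suggested by a pairwise homology percentage."""
    if identity >= same_subtype_min:
        return "same subtype"
    if identity >= same_type_min:
        return "different subtypes of the same type"
    return "different types"


# Toy aligned fragments (hypothetical, not taken from the sequence listing).
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))  # 90.0

# Values quoted in the text for the NS5b region.
print(classify(95.0))   # same subtype
print(classify(79.7))   # different subtypes of the same type
print(classify(65.6))   # different types
```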

Example 9 : The Core/E1 region of HCV type 4

From each of the 3 new type 4 subtypes, one representative serum was selected for cloning
experiments in the Core/E1 region. GB549 (subtype 4g) and GB809 (subtype 4e) were
analyzed together with isolate GB358 that was chosen from the subtype 4c group.
Synthetic oligonucleotides:

After aligning the sequences of HCV-1 (2), HCV-J (1), HC-J6 (3), and HC-J8 (4), PCR
primers were chosen in those regions of little sequence variation.

Primers HCPr52(+): 5'-atgTTGGGTAAGGTCATCGATACCCT-3', HCPr23(+):
5'-CTCATGGGGTACATTCCGCT-3', and HCPr54(-):
5'-CTATTACCAGTTCATCATCATATCCCA-3', were synthesized on a 392 DNA/RNA
synthesizer (Applied Biosystems). The sets of primers HCPr23/54 and HCPr52/54 were used,
but PCR fragments could be obtained only with the primer set HCPr52/54. This set of
primers amplified the sequence from nucleotide 379 to 957 encoding amino acids 127 to 319:
65 amino acids from the carboxyterminus of core and 128 amino acids of E1. The
amplification products GB358-4, GB549-4, and GB809-4 were cloned as described in example
1. The following clones were obtained from the PCR fragments:
From fragment GB358-4: GB358-4-1 (SEQ ID NO 118)
From fragment GB549-4: GB549-4-3 (SEQ ID NO 120)
From fragment GB809-4: GB809-4-3 (SEQ ID NO 122)
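The relation between the amplified nucleotide range and the encoded residues stated in this example (nucleotides 379 to 957, amino acids 127 to 319, of which 65 belong to core and 128 to E1) follows from codon arithmetic. The Python sketch below reproduces it under two stated assumptions: nucleotide 1 is taken to be the first base of the polyprotein initiator codon, and the Core/E1 boundary is placed between residues 191 and 192, the cleavage site mentioned in Example 11. The function names are hypothetical.

```python
def codon_index(nt_position: int) -> int:
    """1-based residue number of the codon containing a 1-based nucleotide position.

    ASSUMPTION: nucleotide 1 is the first base of the polyprotein start codon.
    """
    return (nt_position - 1) // 3 + 1


def amplicon_residues(nt_start: int, nt_end: int, last_core_residue: int = 191):
    """Return (first residue, last residue, core residues, E1 residues) for an amplicon.

    ASSUMPTION: core ends at `last_core_residue` and E1 starts at the next residue.
    """
    first = codon_index(nt_start)
    last = codon_index(nt_end)
    core = max(0, min(last, last_core_residue) - first + 1)
    e1 = max(0, last - max(first, last_core_residue + 1) + 1)
    return first, last, core, e1


print(amplicon_residues(379, 957))  # (127, 319, 65, 128)
```

The same mapping gives residues 119 to 326 for the nucleotide range 355 to 978.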

An alignment of the type 4 Core/E1 nucleotide sequences with SEQ ID NO 118, 120, and 122
with known sequences is presented in Figure 4. The homologies of the type 4 E1 region
(without core) with type 1, type 2, type 3, and type 5 prototype sequences are depicted in
Table 4. Homologies of 53 to 66% are observed with representative isolates of non-type 4
genotypes. Observed homologies in the E1 region within type 4, between the different
subtypes, range from 75.2 to 78.4%. The recently disclosed sequences of the core region
of Egyptian type 4 isolates (for example EG-29 in Figure 3) described by Simmonds et al.
(1993) do not allow alignment with the Gabonese sequences (as described in the present
invention) in the NS5 region and may belong to different type 4 subtype(s), as can be
deduced from the core sequences. The deduced amino acid sequences with SEQ ID NO 119,
121, and 123 are aligned with other prototype sequences in Figure 5. Again, type-specific
variation mainly resides in the variable V regions, designated in the present invention, and
therefore, type-4-specific amino acids or V regions will be instrumental in diagnosis and
therapeutics for HCV type 4.

Example 10 : The Core/E1 and NS5b regions of new HCV type 2, 3 and 4 subtypes

Samples NE92 (subtype 2d), BE98 (subtype 3c), CAM600 and GB809 (subtype 4e),
CAMG22 and CAMG27 (subtype 4f), GB438 (subtype 4h), CAR4/1205 (subtype 4i),
CAR1/501 (subtype 4j), CAR1/901 (subtype 4?), and GB724 (subtype 4?) were selected from
a group of sera that reacted positive but aberrantly in a prototype Line Probe Assay as
described earlier (Stuyver et al., 1993). Another type 5a isolate, BE100, was also analyzed in
the C/E1 region, and yet another type 5a isolate, BE96, in the NS5b region. A high titer of
HCV RNA could be detected, enabling cloning of fragments by a single round of PCR. As
no sequences from any coding region of these subtypes had been disclosed yet, synthetic
oligonucleotides for PCR amplification were chosen in the regions of little sequence variation
after aligning the sequences of HCV-1 (Choo et al., 1991), HCV-J (Kato et al., 1990), HC-J6
(Okamoto et al., 1991), HC-J8 (Okamoto et al., 1992), and the other new sequences of the
present invention.
The above mentioned sets 1, 2 and 3 (see example 5) of primers were used, but only with
set 1 could PCR fragments be obtained from all isolates (except for BE98, GB724, and
CAR1/501). This set of primers amplified the sequence from nucleotide 379 to 957 encoding
amino acids 127 to 319: 65 amino acids from the carboxyterminus of core and 128 amino
acids of E1. With set 3, the core/E1 region from isolates NE92 and BE98 could be amplified,
and with set 2, the core region of GB358, GB724, GB809, and CAM600 could be amplified.
The amplification products were cloned as described in example 1. The following clones were
obtained from the PCR fragments:

from isolate GB724, the clone with SEQ ID NO 193 from the core region,
from isolate NE92, the clone with SEQ ID NO 143,
from isolate BE98, the clone from the core/E1 region of which part of the sequence has been
analyzed and is given in SEQ ID NO 147,
from isolate CAM600, the clone with SEQ ID NO 167 from the E1 region, or SEQ ID NO
165 from the Core/E1 region as shown in Figure 3,
from isolate CAMG22, the clone with SEQ ID NO 171 from the E1 region as shown in
Figure 4,
from isolate GB358, the clone with SEQ ID NO 191 in the core region,
from isolate CAMG27, the clone with SEQ ID NO 173 from the core/E1 region,
from isolate GB438, the clone with SEQ ID NO 177 from the core/E1 region,
from isolate CAR4/1205, the clone with SEQ ID NO 179 from the core/E1 region,
from isolate CAR1/901, the clone with SEQ ID NO 181 from the core/E1 region,
from isolate GB809, the clone GB809-4 with SEQ ID NO 189 from the core/E1 region,
clone GB809-2 with SEQ ID NO 169 from the core/E1 region and the clone with SEQ ID
NO 163 from the core region,
and from isolate BE100, the clone with SEQ ID NO 155 from the Core/E1 region as shown
in Figure 4.

An alignment of these Core/E1 sequences with known Core/E1 sequences is presented in
Figure 4. The deduced amino acid sequences with SEQ ID NO 144, 148, 164, 168, 170, 172,
174, 178, 180, 182, 190, 192, 194, 156, 166 are aligned with other prototype sequences in
Figure 5. Again, type-specific variation mainly resides in the variable V regions, designated
in the present invention, and therefore, type 2d, 3c and type 4-specific amino acids or V
regions will be instrumental in diagnosis and therapeutics for HCV type (subtype) 2d, 3c or
the different type 4 subtypes.

The NS5b region of isolates NE92, BE98, CAM600, CAMG22, GB438, CAR4/1205,
CAR1/501, and BE96 was amplified with primers HCPr206 and HCPr207 (Table 7). The
corresponding fragments were cloned and sequenced as in example 1 and the corresponding
sequences (of which BE98 was partly sequenced) received the following identification
numbers:

NE92: SEQ ID NO 145
BE98: SEQ ID NO 149
CAM600: SEQ ID NO 201
CAMG22: SEQ ID NO 203
GB438: SEQ ID NO 207
CAR4/1205: SEQ ID NO 209
CAR1/501: SEQ ID NO 211
BE95: SEQ ID NO 159
BE96: SEQ ID NO 161

An alignment of these NS5b sequences with known NS5b sequences is presented in Figure
1. The deduced amino acid sequences with SEQ ID NO 146, 150, 202, 204, 206, 208, 210,
212, 160, 162 are aligned with other prototype sequences in Figure 2. Again, subtype-specific
variations can be observed, and therefore, type 2d, 3c and type 4-specific amino acids or V
regions will be instrumental in diagnosis and therapeutics for HCV type (subtype) 2d, 3c or
the different type 4 subtypes.

Example 11 : Genotype-specific reactivity of anti-E1 antibodies (Serotyping)

E1 proteins were expressed from vaccinia virus constructs containing a core/E1 region
extending from nucleotide positions 355 to 978 (Core/E1 clones described in previous
examples including the primers HCPr52 and HCPr54), and expressed proteins from L119
(after the initiator methionine) to W326 of the HCV polyprotein. The expressed protein was
modified upon expression in the appropriate host cells (e.g. HeLa, RK13, HuTK-, HepG2)
by cleavage between amino acids 191 and 192 of the HCV polyprotein and by the addition
of high-mannose type carbohydrate motifs. Therefore, a 30 to 32 kDa glycoprotein could be
observed on western blot by means of detection with serum from patients with hepatitis C.

As a reference, a genotype 1b clone obtained from the isolate HCV-B was also expressed
in an identical way as described above, and was expressed from recombinant vaccinia virus
vvHCV-11A.
A panel of 104 genotyped sera was first tested for reactivity with a cell lysate containing
type 1b protein expressed from the recombinant vaccinia virus vvHCV-11A, and compared
with cell lysate of RK13 cells infected with a wild type vaccinia virus ('E1/WT'). The lysates
were coated as a 1/20 dilution on a normal ELISA microtiter plate (Nunc maxisorb) and left
to react with a 1/20 dilution of the respective sera. The panel consisted of 14 type 1a, 38
type 1b, 21 type 2, 21 type 3a, and 9 type 4 sera. Human antibodies were subsequently
detected by a goat anti-human IgG conjugated with peroxidase and the enzyme activity was
detected. The optical density values of the E1 and wild type lysates were divided and a factor
2 was taken as the cut-off. The results are given in Table A. Eleven out of 14 type 1a sera
(79%), 25 out of 38 type 1b sera (66%), 6 out of 21 type 2 sera (29%), and 5 out of 21
type 3a sera (24%) reacted, whereas none of the 9 type 4 sera or the type 5 serum reacted
(0%). These experiments clearly show the high prevalence of anti-E1 antibodies reactive with
the type 1 E1 protein in patients infected with type 1 (36/52 (69%)) (either type 1a or type
1b), but the low prevalence or absence in non-type 1 sera (11/52 (21%)).
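The scoring rule described above (the optical density on the E1 lysate divided by that on the wild type lysate, with a factor 2 as cut-off) can be sketched in Python as follows. The type 1a ratios are those listed in Table A; the function names are hypothetical, and counting a ratio exactly equal to the cut-off as reactive is an assumption, since the text does not say how ties are handled (no value in the table equals 2.0).

```python
CUT_OFF = 2.0  # E1/WT ratio taken as the cut-off in the text

# E1/WT ratios of the 14 type 1a sera as listed in Table A (serum number -> ratio).
TYPE_1A = {
    "3748": 3.15, "3807": 3.51, "5282": 1.99, "9321": 3.12, "9324": 2.76,
    "9325": 6.12, "9326": 10.56, "9356": 1.79, "9388": 3.5, "8366": 10.72,
    "8380": 2.27, "10925": 4.02, "10936": 5.04, "10938": 1.36,
}


def e1_wt_ratio(od_e1: float, od_wt: float) -> float:
    """Ratio of the optical density on E1 lysate to that on wild type lysate."""
    return od_e1 / od_wt


def is_reactive(ratio: float, cut_off: float = CUT_OFF) -> bool:
    """ASSUMPTION: a ratio equal to the cut-off is counted as reactive."""
    return ratio >= cut_off


def prevalence(ratios, cut_off: float = CUT_OFF):
    """Return (reactive sera, total sera, rounded percentage)."""
    values = list(ratios)
    reactive = sum(1 for r in values if is_reactive(r, cut_off))
    return reactive, len(values), round(100 * reactive / len(values))


print(prevalence(TYPE_1A.values()))  # (11, 14, 79)
```

Applied to the type 1a column, the sketch reproduces the 11 out of 14 (79%) reported in the text.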

TABLE A

serum          E1/WT

type 1a
3748           3.15
3807           3.51
5282           1.99
9321           3.12
9324           2.76
9325           6.12
9326           10.56
9356           1.79
9388           3.5
8366           10.72
8380           2.27
10925          4.02
10936          5.04
10938          1.36

type 1b
[illegible]    8.72
10929          8.26
10931          2.3
10932          4.41
44             2.37
45             3.14
46             4.37
47             5.68
48             2.97
49             1.18
50             9.[?]5
51             [illegible]
52             1.11
53             5.20
54             0.98
55             1.48
56             1.06
57             3.85
58             7.6
59             3.28
60             3.23
61             7.82
62             1.92

type 2
23             0.91
24             1.16
25             2.51
26             0.96
27             1.20
28             0.96
29             2.58
30             8.05
31             0.92
32             0.82
33             5.75
34             0.7[?]
35             [illegible]
36             0.85
37             0.76
38             0.92
39             1.08
40             2.33
41             2.83
42             1.21
43             0.91

type 3
1              6.88
2              1.47
3              [illegible]
4              6.52
5              10.24
6              2.72
7              1.11
8              1.54
9              1.60
10             1.21
11             1.07
12             1.00
13             0.85
14             0.96
15             0.51
16             1.00
17             1.09
18             0.99
19             1.04
20             1.04
21             0.96

type 4
22             0.87
GB48           [illegible]
GB113          0.68
GB116          [illegible]
GB215          0.52
GB358          0.56
GB359          0.71
GB438          1.08
GB516          1.04

type 5
BE95           0.86

Core/E1 clones of isolates BR36 (type 3a) and BE95 (type 5a) were subsequently recombined
into the viruses vvHCV-62 and vvHCV-63, respectively. A genotyped panel of sera was
subsequently tested on cell lysates obtained from RK13 cells infected with the recombinant
viruses vvHCV-62 and vvHCV-63. Tests were carried out as described above, and the results
are given in the table below (TABLE B). From these results, it can clearly be seen that,
although some cross-reactivity occurs (especially between type 1 and 3), the obtained values
of a given serum are usually higher on its homologous E1 protein than on an E1 protein of
another genotype. For type 5 sera, none of the 5 sera were reactive on type 1 or 3 E1
proteins, while 3 out of 5 were shown to contain anti-E1 antibodies when tested on their
homologous type 5 protein. Therefore, in this simple test system, a considerable number of
sera can already be serotyped. Combined with the reactivity to type-specific NS4 epitopes or
epitopes derived from other type-specific parts of the HCV polyprotein, a serotyping assay
may be developed for discriminating the major types of HCV. To overcome the problem of
cross-reactivity, the position of cross-reactive epitopes may be determined by someone skilled
in the art (e.g. by means of competition of the reactivity with synthetic peptides), and the
epitopes evoking cross-reactivity may be left out of the composition to be included in the
serotyping assay or may be included in sample diluent to outcompete cross-reactive
antibodies.

Table 5. Homologies of new HCV sequences with other known HCV types

Region          Isolate       1a             1b             2a             2b             3a             3b
(nucleotides)   (type)        HCV-1          HCV-J          HC-J6          HC-J8          T1     T7      T9     T10

Core            PC (5)        83.8 (91.6)    84.8 (92.1)    82.6 (90.1)    82.4 (89.0)
(1-573)

E1              HD10 (3)      61.5 (68.0)    64.6 (68.8)    57.8 (55.5)    56.3 (59.4)
(574-957)       BR36 (3)      62.0 (66.4)    62.5 (67.2)    56.5 (53.9)    55.2 (58.6)
                BR33 (3)      60.7 (67.2)    63.3 (68.0)    56.5 (54.7)    56.0 (58.6)
                PC (5)        61.4 (64.0)    62.4 (64.8)    54.1 (49.6)    53.3 (47.2)
                GB358 (4a)    62.5 (69.1)    62.8 (65.9)    59.4 (54.0)    54.4 (54.0)
                GB549 (4b)    66.0 (72.2)    62.8 (69.8)    59.1 (56.4)    56.5 (54.0)
                GB809 (4c)    63.3 (69.1)    60.7 (64.3)    56.7 (53.2)    53.0 (51.6)

NS3             PC (5)        74.7 (89)      76.1 (86.4)    76.1 (89.8)    78.0 (89.0)
(3856-4209)

NS4             BR36 (3)      67.8 (78.5)    69.8 (75.1)    62.0 (67.5)    61.7 (66.0)
(4892-5292)     HD10 (3)           (74.6)    66.6 (69.7)    57.8 (59.9)    59.1 (59.9)

NS4             PC (5)        [?]1.3 (62.2)  63.0 (65.5)    52.9 (46.2)    54.3 (43.7)
(4936-5292)

NS5b            BI04 (3)      65.7           66.7           63.9           64.3           94.8   93.9    75.6   77.0
(8023-8235)     BR36 (3)      64.3           67.6           64.8           66.7           94.8   93.4    75.1   76.5
                BR33 (3)      65.7           67.1           64.3           64.8           94.8   93.9    76.0   77.5
                GB358 (4a)    67.7 (76.1)    65.6 (77.0)    66.5 (70.8)    65.6 (71.7)
                GB549 (4b)    68.8 (76.1)    67.1 (77.0)    65.9 (71.7)    65.9 (74.4)
                GB809 (4c)    68.5 (73.5)    65.0 (73.5)    67.7 (69.9)    67.7 (73.5)

Shown are the nucleotide homologies (the amino acid homology is given between brackets)
for the region indicated in the left column.

Table 6. NS4 sequences of the different genotypes

prototype   TYPE   SYNTHETIC PEPTIDE NS4-1    SYNTHETIC PEPTIDE NS4-5    SYNTHETIC PEPTIDE NS4-7
                   (NS4a)                     (NS4b)                     (NS4b)

position ->        1690       1700            1720       1730            1730       1740

HCV-1       1a     LSG KPAIIPDREV LYREFDE     SQHLPYIEQ GMMLAEQFKQ K     LAEQFKQ KALGLLQTAS RQA
HCV-J       1b     LSG RPAVIPDREV LYQEFDE     ASHLPYIEQ GMQLAEQFKQ K     LAEQFKQ KALGLLQTAT KQA
HC-J6       2a     VNQ RAVVAPDKEV LYEAFDE     ASRAALDEE GQRIAEMLKS K     IAEMLKS KIQGLLQQAS KQA
HC-J8       2b     LND RVVVAPDKEI LYEAFDE     ASKAALIEE GQRMAEMLKS K     MAEMLKS KIQGLLQQAT RQA
BR36        3a     LGG KPAIVPDKEV LYQ[?]YDE   SQAAPYIEQ AQVLAHQFKE K     IAHQFKE KKLGLLQRAT QQQ
PC          5      LSG KPAIIPDREA LYQ[?]FDE   AASLPYMDE TRAIAGQFKE K     IAGQFKE KVLGFISTTG QKA

*, residues conserved in every genotype. Underlined amino acids are type-specific; amino
acids in italics are unique to type 3 and 5 sequences.

Table 7

SEQ ID NO   Primer (polarity)   Sequence from 5' to 3'

63          HCPr161(+)          5'-ACCGGAGGCCAGGAGAGTGATCTCCTCC-3'
64          HCPr162(-)          5'-gggctgctctatcctcatcgAcgccatg-3'
65          HCPr163(+)          5'-GCCAGAGGCTCGGAAGGCGATCAGCGCT-3'
66          HCPr164(-)          5'-GAGGTGCTCTGTCCTCCTCGACGCCGCA-3'
67          HCPr23(+)           5'-CTCATGGGGTACATTCCGCT-3'
68          HCPr54(-)           5'-CTATTACCAGTTCATCATCATATCCCA-3'
69          HCPr116(+)          5'-ttttAAATACATCATGRCITGYATG-3'
70          HCPr66(-)           5'-ctattaTTGTATCCCRCTGATGAARTTCCACAT-3'
71          HCPr118(-)          5'-actagtcgactaYTGIATICCRCTIATRWARTTCCACAT-3'
72          HCPr117(+)          5'-ttttAAATACATCGCIRCITGCATGCA-3'
73          HCPr119(-)          5'-actagtcgactaRTTIGCIATIAGCCKRTTCATCCAYTG-3'
74          HCPr131(+)          5'-ggaattctagaCCITCITGGGAYGARAYITGGAARTG-3'
75          HCPr130(+)          5'-ggaattctagACIGCITAYCARGCIACIGTITGYGC-3'
76          HCPr134(+)          5'-CATATAGATGCCCACTTCCTATC-3'
77          HCPr3(+)            5'-GTGTGCCAGGACCATC-3'
78          HCPr4(-)            5'-GACATGCATGTCATGATGTA-3'
79          HCPr152(+)          5'-TACGCCTCTTCTATATCGGTTGGGGCCTG-3'
80          HCPr52(+)           5'-atgTTGGGTAAGGTCATCGATACCCT-3'
81          HCPr41(+)           5'-CCCGGGAGGTCTCGTAGACCGTGCA-3'
82          HCPr40(-)           5'-ctattaAAGATAGAGAAAGAGCAACCGGG-3'
124         HCPR206             5'-tggggatcccgtatgatacccgctgctttga-3'
125         HCPR207             5'-ggcggaattcctggtcatagcctccgtgaa-3'
141         HCPR109             5'-tgggatatgatgatgaactggtc-3'
142         HCPR14              5'-ccaggtacaaccgaaccaattgcc-3'






WO 94/25601    PCT/EP94/01323



[Table (pages 72-75 of the original): sera grouped as type 1b, type 2b and type 4, scored
under the column headings Type 1 NS4, Type 2 NS4 and Type 3 NS4; the individual cell
values are not recoverable.]

REFERENCES , 

' ' ' ' • ' • , 

Barany F (1991). Genetic disease detection and DNA amplification using cloned thermostable 
ligase. Proc Natl Acad Sci USA 88: 189-193. 

■\ Bej A, Mahbubani M, Miller R, Di Cesare HafT L. Atlas R (,990) Mutipte PCR 
■ ■ antphficanon and Mobilized capture probes for detection of bacteria, pathogens and 
1 .indicators in water. Mol Cell Probes 4:353-365. 

Bukh ,, Pureed R, MiUer R (,992). 4ence analysis ofthe 5- noncoding region of hepatitis 
C virus. Proc Natl Acad Sci USA 89:4942-4946. 

Bukh J, Purcell R, Miller R (1993). At least 12 genotypes ... PNAS 90,8234-8238. ' 

Cha T, Beal E, Irvine B, Kolberg ,, Chien D. Koo G. Urdea M (,992) At leas, f,ve rela«ed 
, bo, drstlnct, hepatitis C viral genotypes exist Proc Na„ Acad Sci USA 89:7144-7148. 

Cban S-W. SimntoodsP, McOmish F, Yap P, MitcbeU R, DowB. FolletiE (,99,) Serological 
responses to infection whh three different rypes of hepatitis C virus. Lance, 338:,99,. 

Chan S-W. McOmish F, Hohnes E. Do W Bj p eutherer % Yap p p ^ 

Analyse of a new hepatitis C virus type and its phylogenetic relationship to existing variants 
J Gen Virol 73:1131-1141. 

Chomczynski P, Sacchi N (1987) Single step me thod of RNA isolation by acid guanidinium 
thiocyanate-phenol-chloroform extraction. Anal Biochem 162:156-159, 

Choo Q, Richman K, Han J, Berger K, Lee C, Dong C, Gallegos C, Coit D, Medina-Selby A,
Barr P, Weiner A, Bradley D, Kuo G, Houghton M (1991) Genetic organization and diversity
of the hepatitis C virus. Proc Natl Acad Sci USA 88:2451-2455.

Compton J (1991) Nucleic acid sequence-based amplification. Nature 350:91-92.

Duchosal A, Eming S, Fisher P (1992) Immunization of hu-PBL-SCID mice and the rescue of
human monoclonal Fab fragments through combinatorial libraries. Nature 355:258-262.

Duck P (1990) Probe amplifier system based on chimeric cycling oligonucleotides.
Biotechniques 9:142-147.

Guatelli J, Whitfield K, Kwoh D, Barringer K, Richman D, Gingeras T (1990) Isothermal, in
vitro amplification of nucleic acids by a multienzyme reaction modeled after retroviral
replication. Proc Natl Acad Sci USA 87:1874-1878.

Hijikata M, Kato N, Ootsuyama Y, Nakagawa M, Shimotohno K (1991) Gene mapping of
the putative structural region of the hepatitis C virus genome by in vitro processing analysis.
Proc Natl Acad Sci USA 88:5547-5551.

Jacobs K, Rudersdorf R, Neill S, Dougherty J, Brown E, Fritsch E (1988) The thermal
stability of oligonucleotide duplexes is sequence independent in tetraalkylammonium salt
solutions: application to identifying recombinant DNA clones. Nucl Acids Res 16:4637-4650.

Kato N, Hijikata M, Ootsuyama Y, Nakagawa M, Ohkoshi S, Sugimura T, Shimotohno K
(1990) Molecular cloning of the human hepatitis C virus genome from Japanese patients with
non-A, non-B hepatitis. Proc Natl Acad Sci USA 87:9524-9528.

Kwoh D, Davis G, Whitfield K, Chappelle H, Dimichele L, Gingeras T (1989) Transcription-based
amplification system and detection of amplified human immunodeficiency virus type
1 with a bead-based sandwich hybridization format. Proc Natl Acad Sci USA 86:1173-1177.

Kwok S, Kellogg D, McKinney N, Spasic D, Goda L, Levenson C, Sninsky J (1990) Effects
of primer-template mismatches on the polymerase chain reaction: human immunodeficiency
virus type 1 model studies. Nucl Acids Res 18:999.

Landegren U, Kaiser R, Sanders J, Hood L (1988) A ligase-mediated gene detection technique.
Science 241:1077-1080.




Lizardi P, Guerra C, Lomeli H, Tussie-Luna I, Kramer F (1988) Exponential amplification of
recombinant RNA hybridization probes. Bio/Technology 6:1197-1202.

Lomeli H, Tyagi S, Pritchard C, Lizardi P, Kramer F (1989) Quantitative assays based on
the use of replicatable hybridization probes. Clin Chem 35:1826-1831.

Machida A, Ohnuma H, Tsuda F, Munekata E, Tanaka T, Akahane Y, Okamoto H, Mishiro
S (1992) Hepatology 16:886-891.

Maniatis T, Fritsch E, Sambrook J (1982) Molecular cloning: a laboratory manual. Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Mori S, Kato N, Yagyu A, Tanaka T, Ikeda Y, Petchclai B, Chiewsilp P, Kurimura T,
Shimotohno K (1992) A new type of hepatitis C virus in patients in Thailand. Biochem
Biophys Res Comm 183:334-342.



Okamoto H, Okada S, Sugiyama Y, Kurai K, Iizuka H, Machida A, Miyakawa Y, Mayumi
M (1991) Nucleotide sequence of the genomic RNA of hepatitis C virus isolated from a
human carrier: comparison with reported isolates for conserved and divergent regions. J Gen
Virol 72:2697-2704.

Okamoto H, Kurai K, Okada S, Yamamoto K, Iizuka H, Tanaka T, Fukuda S, Tsuda F,
Mishiro S (1992) Full-length sequences of a hepatitis C virus genome having poor homology
to reported isolates: comparative study of four distinct genotypes. Virology 188:331-341.

Persson M, Caothien R, Burton D (1991) Generation of diverse high-affinity human
monoclonal antibodies by repertoire cloning. Proc Natl Acad Sci USA 89:2432-2436.

Saiki R, Gelfand D, Stoffel S, Scharf S, Higuchi R, Horn G, Mullis K, Erlich H (1988)
Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase.
Science 239:487-491.

Saiki R, Walsh P, Levenson C, Erlich H (1989) Genetic analysis of amplified DNA with
immobilized sequence-specific oligonucleotide probes. Proc Natl Acad Sci USA
86:6230-6234.

Sano T, Smith C, Cantor C (1992) Immuno-PCR: very sensitive antigen detection by means
of specific antibody-DNA conjugates. Science 258:120-122.

Simmonds P, McOmish F, Yap P, Chan S, Lin C, Dusheiko G, Saeed A, Holmes E (1993)
Sequence variability in the 5' non-coding region of hepatitis C virus: identification of a new
virus type and restrictions on sequence diversity. J Gen Virol 74:661-668.

Stuyver L, Rossau R, Wyseur A, Duhamel M, Vanderborght B, Van Heuverswyn H, Maertens
G (1993) Typing of hepatitis C virus (HCV) isolates and characterization of new (sub)types
using a Line Probe Assay. J Gen Virol 74:1093-1102.

Walker G, Little M, Nadeau J, Shank D (1992) Isothermal in vitro amplification of DNA by
a restriction enzyme/DNA polymerase system. Proc Natl Acad Sci USA 89:392-396.

Wu D, Wallace B (1989) The ligation amplification reaction (LAR) - amplification of specific
DNA sequences using sequential rounds of template-dependent ligation. Genomics 4:560-569.



CLAIMS

1. A composition comprising or consisting of at least one polynucleic acid containing 8 or
more contiguous nucleotides selected from at least one of the following HCV sequences:

- an HCV type 3 genomic sequence, more particularly in any of the following regions:
  - the region spanning positions 417 to 957 of the Core/E1 region of HCV subtype 3a,
  - the region spanning positions 4664 to 4730 of the NS3 region of HCV type 3,
  - the region spanning positions 4892 to 5292 of the NS3/4 region of HCV type 3,
  - the region spanning positions 8023 to 8235 of the NS5 region of HCV subtype 3a,
- an HCV subtype 3c genomic sequence,
- an HCV subtype 2d genomic sequence,
- an HCV type 4 genomic sequence,
- the coding region of HCV subtype 5a,

with said nucleotide numbering being with respect to the numbering of HCV nucleic acids as
shown in Table 1, and with said polynucleic acids containing at least one nucleotide difference
with known HCV polynucleic acid sequences in the above-indicated regions, or the
complement thereof.
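
By way of illustration only, and forming no part of the claims, the two conditions of claim 1 can be sketched in Python: a candidate oligonucleotide shares a run of at least 8 contiguous nucleotides with a new type-specific region, and it shows at least one nucleotide difference with the known sequences. The sequences below are hypothetical placeholders rather than entries of the sequence listing, the function names are invented for this sketch, and "difference" is assumed here to mean that the candidate does not occur verbatim in any known sequence; complements are not considered.

```python
def contains_run(candidate: str, region: str, min_len: int = 8) -> bool:
    """Return True if `candidate` shares a run of at least `min_len`
    contiguous nucleotides with `region`."""
    candidate, region = candidate.upper(), region.upper()
    # Any shared run of length >= min_len contains a shared run of exactly min_len.
    region_kmers = {region[i:i + min_len] for i in range(len(region) - min_len + 1)}
    return any(
        candidate[i:i + min_len] in region_kmers
        for i in range(len(candidate) - min_len + 1)
    )


def differs_from_known(candidate: str, known_sequences) -> bool:
    """Assumption of this sketch: the candidate 'differs' if it is not an
    exact substring of any known sequence."""
    candidate = candidate.upper()
    return all(candidate not in known.upper() for known in known_sequences)


# Hypothetical toy data (not taken from the SEQ ID listing).
NEW_REGION = "ATGGCTTACGGATCCAGT"
KNOWN_REGIONS = ["ATGGCTTACGCATCCAGT"]  # differs from NEW_REGION at one position

candidate = "TTACGGATCC"
qualifies = contains_run(candidate, NEW_REGION) and differs_from_known(candidate, KNOWN_REGIONS)
print(qualifies)
```

A candidate lying entirely in a stretch that is identical between the new and the known sequence (for example `ATGGCTTACG` in the toy data) passes the first check but fails the second.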

2. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

- an HCV genomic sequence having a homology of at least 67%, preferably more than
69%, most preferably 71% or more to any of the sequences as represented in SEQ ID NO
13, 15, 17, 19, 21, 23, 25 or 27 in the region spanning positions 417 to 957 of the
Core/E1 region;

- an HCV genomic sequence having a homology of at least 65%, preferably more than
67%, most preferably 69% or more to any of the sequences as represented in SEQ ID NO
19, 21, 23, 25 or 27 in the region spanning positions 574 to 957 of the E1 region;

- an HCV genomic sequence having a homology of at least 79%, more preferably at least
81%, most preferably more than 83% or more to any of the sequences as represented in
SEQ ID NO 147 in the region spanning positions 1 to 378 of the Core region;

- an HCV genomic sequence having a homology of at least 74%, more preferably at least
76%, most preferably more than 78% or more to any of the sequences as represented in
SEQ ID NO 13, 15, 17, 19, 21, 23, 25 or 27, in the region spanning positions 417 to 957
in the Core/E1 region;

- an HCV genomic sequence having a homology of at least 74%, preferably more than
76%, most preferably 78% or more to any of the sequences as represented in SEQ ID NO
13, 15, 17, 19, 21, 23, 25 or 27 in the region spanning positions 574 to 957 in the E1
region;

- an HCV genomic sequence having a homology of more than 73.5%, preferably more than
74%, most preferably 75% homology to any of the sequences as represented in SEQ ID
NO 29 in the region spanning positions 4664 to 4730 of the NS3 region;

- an HCV genomic sequence having a homology of more than 70%, preferably more than
72%, most preferably more than 74% homology to any of the sequences as represented
in SEQ ID NO 29, 31, 33, 35, 37 or 39 in the region spanning positions 4892 to 5292
in the NS3/NS4 region;

- an HCV genomic sequence having a homology of more than 95%, preferably 95.5%,
most preferably 96% homology to any of the sequences as represented in SEQ ID NO 5,
7, 1, 3, 9 or 11 in the region spanning positions 8023 to 8235 of the NS5 region;

- an HCV genomic sequence of the BR36 subgroup of HCV type 3a having a homology
of more than 96%, preferably 96.5%, most preferably 97% homology to any of the
sequences as represented in SEQ ID NO 5, 7, 1, 3, 9 or 11 in the region spanning
positions 8023 to 8192 of the NS5B region;

- an HCV genomic sequence having a homology of more than 79%, more preferably more
than 81%, and most preferably more than 83% to the sequence as represented in SEQ ID
NO 149 in the region spanning positions 7932 to 8271 in the NS5B region.
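
The homology thresholds recited above may be illustrated, without limiting the claims, by a minimal Python sketch that computes percent identity over two pre-aligned sequences and reports which of the three recited levels is reached. It is assumed here that "homology" means percent identity over the stated region, that both sequences have already been aligned and trimmed to that region according to the numbering of Table 1, and that alignment columns containing a gap are skipped. The claims mix "at least" and "more than"; the sketch uses "at least" throughout for simplicity. The function names and test sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of compared alignment columns that are identical.
    Columns with a gap ('-') in either sequence are skipped (an assumption)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def homology_tier(identity: float, thresholds):
    """thresholds = (claimed, preferred, most_preferred) lower bounds in percent.
    Returns the strictest level reached, or None if below the claimed bound."""
    claimed, preferred, most_preferred = thresholds
    if identity >= most_preferred:
        return "most preferred"
    if identity >= preferred:
        return "preferred"
    if identity >= claimed:
        return "claimed"
    return None


# Thresholds of the first item of claim 2 (Core/E1 region, positions 417 to 957).
CORE_E1_TYPE3 = (67.0, 69.0, 71.0)

identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # hypothetical toy alignment
print(identity, homology_tier(identity, CORE_E1_TYPE3))
```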

3. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

- an HCV genomic sequence having a homology of more than 85%, preferably more than
86%, most preferably more than 87% homology to any of the sequences as represented
in SEQ ID NO 41, 43, 45, 47, 49, 51, 53 or 151 in the region spanning positions 1 to
573 of the Core region;

[...] 83%, most preferably more than 85% homology to the sequence as represented in SEQ
ID NO 189 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 85%, preferably more than
87%, most preferably more than 89% homology to any of the sequences as represented
in SEQ ID NO 167 or 169 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 79%, preferably more than
81%, most preferably more than 83% homology to any of the sequences as represented
in SEQ ID NO 171 or 173 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 84%, preferably more than
86%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 175 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 83%, preferably more than
85%, most preferably more than 87% homology to the sequence as represented in SEQ
ID NO 177 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 76%, preferably more than
78%, most preferably more than 80% homology to the sequence as represented in SEQ
ID NO 179 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 84%, preferably more than
86%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 181 in the region spanning positions 379 to 957 of the E1 region;

- an HCV genomic sequence having a homology of more than 73%, preferably more than
75%, most preferably more than 77% homology to any of the sequences as represented
in SEQ ID NO 106, 108, 110, 112, 114, or 116 in the region spanning positions 7932 to
8271 of the NS5 region;

- an HCV genomic sequence having a homology of more than 88%, preferably more than
89%, most preferably more than 90% homology to any of the sequences as represented
in SEQ ID NO 106, 108, 110, or 112 in the region spanning positions 7932 to 8271 of
the NS5 region;

- an HCV genomic sequence having a homology of more than 88%, preferably more than
89%, most preferably more than 90% homology to any of the sequences as represented
in SEQ ID NO 116 or 201 in the region spanning positions 7932 to 8271 of the NS5
region;

- an HCV genomic sequence having a homology of more than 87%, preferably more than
89%, most preferably more than 90% homology to the sequence as represented in SEQ
ID NO 203 in the region spanning positions 7932 to 8271 of the NS5 region;

- an HCV genomic sequence having a homology of more than 85%, preferably more than
87%, most preferably more than 89% homology to the sequence as represented in SEQ
ID NO 114 in the region spanning positions 7932 to 8271 of the NS5 region;

- an HCV genomic sequence having a homology of more than 86%, preferably more than
87%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 207 in the region spanning positions 7932 to 8271 of the NS5 region;

- an HCV genomic sequence having a homology of more than 84%, preferably more than
86%, most preferably more than 88% homology to the sequence as represented in SEQ
ID NO 209 in the region spanning positions 7932 to 8271 of the NS5 region;

- an HCV genomic sequence having a homology of more than 81%, preferably more than
83%, most preferably more than 85% homology to the sequence as represented in SEQ
ID NO 211 in the region spanning positions 7932 to 8271 of the NS5 region.

5. A composition according to claim 1, wherein said polynucleic acids correspond to a
nucleotide sequence selected from any of the following HCV genomic sequences:

- an HCV genomic sequence having a homology of more than 78%, preferably more than
80%, most preferably more than 82% homology to the sequence as represented in SEQ
ID NO 143 in the region spanning positions 379 to 957 of the Core/E1 region;

- an HCV genomic sequence having a homology of more than 74%, preferably more than
76%, most preferably more than 78% homology to the sequence as represented in SEQ
ID NO 143 in the region spanning positions 574 to 957;

- an HCV genomic sequence having a homology of more than 87%, preferably more than
89%, most preferably more than 91% homology to the sequence as represented in SEQ
ID NO 145 in the region spanning positions 7932 to 8271 of the NS5B region.

6. A composition according to any of claims 1 to 5, wherein said polynucleic acid is liable
to act as a primer for amplifying the nucleic acid of a certain isolate belonging to the genotype
from which the primer is derived.

7. A composition according to any of claims 1 to 5, wherein said polynucleic acid is able
to act as a hybridization probe, for specific detection and/or classification into types of a
nucleic acid containing said nucleotide sequence, with said oligonucleotide being possibly
labelled or attached to a solid substrate.

8. Use of a composition according to any of claims 1 to 7 for in vitro detecting the presence
of one or more HCV genotypes, more particularly for detecting the presence of a nucleic acid
of any of the HCV genotypes having a nucleotide sequence as defined in any of claims 1 to
5, present in a biological sample liable to contain them, comprising at least the following
steps:

(i) possibly extracting sample nucleic acid,

(ii) possibly amplifying the nucleic acid with at least one of the primers according to
claim 6 or any other HCV type 2, HCV type 3, HCV type 4, HCV type 5 or
universal HCV primer,

(iii) hybridizing the nucleic acids of the biological sample, possibly under denatured
conditions, and with said nucleic acids being possibly labelled during or after
amplification, at appropriate conditions with one or more probes according to claim
7, with said probes being preferably attached to a solid substrate,

(iv) washing at appropriate conditions,

(v) detecting the hybrids formed,

(vi) inferring the presence of one or more HCV genotypes present from the observed
hybridization pattern.
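
Step (vi) above can be pictured, purely as a non-limiting sketch, as a table lookup from the set of reactive probes to the genotypes they indicate. The probe names and the probe-to-type assignments below are hypothetical and are not taken from the specification; a real assay would define them from the probe sequences actually immobilized on the substrate.

```python
# Hypothetical probe panel: probe name -> genotype it is specific for.
# None marks a universal control probe that reacts with every HCV isolate.
PROBE_TYPES = {
    "probe_U": None,
    "probe_A": "type 3",
    "probe_B": "type 4",
    "probe_C": "type 5",
}


def infer_genotypes(reactive_probes):
    """Return the sorted list of genotypes indicated by the reactive probes."""
    reactive = set(reactive_probes)
    unknown = reactive - set(PROBE_TYPES)
    if unknown:
        raise KeyError(f"probes not in panel: {sorted(unknown)}")
    return sorted({PROBE_TYPES[p] for p in reactive if PROBE_TYPES[p] is not None})


print(infer_genotypes({"probe_U", "probe_B"}))
print(infer_genotypes({"probe_U", "probe_A", "probe_C"}))
```

A pattern with two type-specific probes reacting, as in the second call, would be read as a sample containing more than one genotype.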

9. A composition consisting of or comprising at least one peptide or polypeptide containing
in its sequence a contiguous sequence of at least 5 amino acids of an HCV polyprotein
encoded by any of the polynucleic acids according to any of claims 1 to 5.

10. A composition according to claim 9, wherein said contiguous sequence contains in its
sequence at least one of the following amino acid residues:

L7, Q43, M44, S60, R67, Q70, T71, A79, A87, N106, K115, A127, A190, S130, V134,
G142, I144, E152, A157, V158, P165, S177 or Y177, I178, V180 or E180 or F182, R184,
I186, H187, T189, A190, S191 or G191, Q192 or L192 or I192 or V192 or E192, N193 or
H193 or P193, W194 or Y194, H195, A197 or I197 or V197 or T197, V202, I203 or L203,
Q208, A210, V212, F214, T216, R217 or D217 or E217 or V217, H218 or N218, H219 or
V219 or L219, L227 or I227, M231 or E231 or Q231, T232 or D232 or A232 or K232, Q235
or I235, A237 or T237, I242, I246, S247, S248, V249, S250 or Y250, I251 or V251 or M251
or F251, D252, T254 or V254, L255 or V255, E256 or A256, M258 or F258 or V258, A260
or Q260 or S260, A261, T264 or Y264, M265, I266 or A266, A267, G268 or T268, F271 or
M271 or V271, I277, M280 or H280, I284 or A284 or L284, V274, V291, N292 or S292,
M * °' *93 or Y293, Q294 or R294, L297 or I297 or Q297, A299 or K299 or Q299, N303
or T303, T308 or L308, T310 or F310 or A310 or D310 or V310, L313, G317 or Q317,
L333, S351, A358, A359, A363, S364, A366, T369, L373, F376, Q386, I387, S392, I399,
F402, I403, R405, D454, A461, A463, T464, K484, Q500, E501, S521, K522, H524, N528,
S531, S532, V534, F536, F537, M53?, I546, C1282, A1283, H1310, V1312, Q1321, P1368,
V1372, V1373, K1405, Q1406, S1409, A1424, A1429, C1435, S1436, S1456, H1496, A1504,
D1510, D1529, I1543, N1567, D1556, N1567, M1572, Q1579, L1581, S1583, W^i,
E1606 or T1606, M1611, V1612 or L1612, P1630, C1636, P1651, T1656 or I1656, L^',
V1667, V1677, A1681, H1685, E1687, G1689, V1695, A1700, Q1704, Y1705, A1713, A1714
or S1714, M1718, D1719, A1721 or T1721, R1722, A1723 or V1723, H1726 or G1726,
E1730, V1732, F1735, I1736, S1737, R1738, T1739, G1740, Q1741, K1742, Q1743, A1744,
T1745, I1746, E1747 or K1747, I1749, A1750, T1751 or A1751, V1753, mi55, K1756,
A1757, P1758, A1759, H1762, T1763, Y1764, P2645, A2647, K2650, K2653 or L2653,
S2664, N2673, F2680, K2681, L2686, H2692, Q2695 or L2695 or I2695, V2712, F2715,
V2719 or Q2719, T2722, T2724, S2725, R2726, G2729, Y2735, H2739, I2748, G2746 or
I2746, I2748, P2752 or K2752, P2754 or T2754, T2757 or P2757,

with said notation being composed of a letter representing the amino acid residue by its
one-letter code, and a number representing the amino acid numbering according to Kato et al.,
1990 as shown in Table 1.
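
The residue notation defined above (one-letter code followed by the position in the polyprotein) lends itself to a simple check, sketched below in Python for illustration only. The peptide and its start position are hypothetical; the residue tokens are a small subset of those recited in claim 10, and the helper names are invented for this sketch.

```python
import re


def parse_residue(token: str):
    """Split a token such as 'Q192' into ('Q', 192)."""
    match = re.fullmatch(r"([A-Z])(\d+)", token)
    if match is None:
        raise ValueError(f"not a residue token: {token!r}")
    return match.group(1), int(match.group(2))


def matching_residues(peptide: str, start: int, residues):
    """Return the tokens whose residue is found at the stated position.
    `start` is the polyprotein position (1-based) of the peptide's first residue."""
    hits = []
    for token in residues:
        amino_acid, position = parse_residue(token)
        index = position - start
        if 0 <= index < len(peptide) and peptide[index] == amino_acid:
            hits.append(token)
    return hits


# Hypothetical 5-residue peptide assumed to start at position 190.
subset = ["A190", "S191", "Q192", "N193", "W194"]
print(matching_residues("ASQHT", 190, subset))
```

Under claim 10 a single hit suffices, so a non-empty result is the condition of interest.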

11. A composition according to any of claims 9 or 10, wherein said contiguous sequence
is selected from any of the following HCV amino acid sequences:

- a sequence having a homology of more than 72%, preferably more than 74%, and most
preferably more than 77% homology to any of the amino acid sequences as represented in
SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 in the region spanning positions 140 to 319
in the Core/E1 region;

- a sequence having a homology of more than 70%, preferably more than 72%, and most
preferably more than 75% homology to any of the amino acid sequences as represented in
SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 in the E1 region spanning positions 192 to
319;

- a sequence having a homology of more than 86%, preferably more than 88%, and most
preferably more than 90% homology to the amino acid sequences as represented in SEQ
ID NO 148 in the region spanning positions 1 to 110 in the Core region;

- a sequence having a homology of more than 76%, preferably more than 78%, most
preferably more than 80% to any of the amino acid sequences as represented in SEQ ID
NO 30, 32, 34, 36, 38 or 40 in the region spanning positions 1646 to 1764 in the NS3/NS4
region;

- a sequence having a homology of more than 81.5%, preferably more than 83%, and most
preferably more than 86% homology to any of the amino acid sequences as represented in
SEQ ID NO 14, 16, 18, 20, 22, 24, 26 or 28 in the E1 region spanning positions 192 to
319;

- a sequence having a homology of more than 86%, preferably more than 88%, most
preferably more than 90% to the amino acid sequence as represented in SEQ ID NO 150
in the region spanning positions 2645 to 2757 in the NS5B region.

12. A composition according to any of claims 9 or 10, wherein said contiguous sequence
is selected from any of the following HCV amino acid sequences:

- a sequence having a homology of more than 80%, preferably more than 82%, most
preferably more than 84% homology to any of the amino acid sequences as represented in
SEQ ID NO 118, 120, and 122 in the region spanning positions 127 to 319,

- a sequence having a homology of more than 73%, preferably more than 75%, most
preferably more than 78% homology in the E1 region spanning positions 192 to 319 to any
of the amino acid sequences as represented in SEQ ID NO 118, 120, and 122, in the region
spanning positions 127 to 319,

- a sequence having more than 85%, preferably more than 86%, most preferably more than
87% homology to any of the amino acid sequences as represented in SEQ ID NO 118, 120
or 122, in the region spanning positions 192 to 319.

13. A composition according to any of claims 9 or 10, wherein said contiguous sequence
is selected from any of the following HCV amino acid sequences:

- a sequence having more than 93%, preferably more than 94%, most preferably more than
95% homology in the region spanning Core positions 1 to 191 to any of the amino acid
sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52, 54, or 152;

- a sequence having more than 73%, preferably more than 74%, most preferably more than
76% homology in the region spanning E1 positions 192 to 319 to any of the amino acid
sequences as represented in SEQ ID NO 42, 44, 46, 48, 50, 52, 54, 154 or 156;

- a sequence spanning positions 1286 to 1403 of the NS3 region, with said sequence being
characterized as having more than 90%, preferably more than 91%, most preferably more
than 92% homology to any of the amino acid sequences represented in SEQ ID NO 56 to
[...]

- a sequence spanning positions 1646 to 1764 of the NS3/4 region, with said sequence being
characterized as having more than 66%, more particularly 68%, most particularly 70% or
more homology to any of the amino acid sequences as represented in SEQ ID NO 60 or
62.

14. A composition according to any of claims 9 to 10, wherein said contiguous sequence is
selected from any of the following HCV amino acid sequences:

- a sequence having more than 83%, preferably more than 85%, most preferably more
than 87% homology in the region spanning Core positions 1 to 319 to the amino acid
sequence as represented in SEQ ID NO 144;

- a sequence having more than 79%, preferably more than 81%, most preferably more
than 84% homology in the region spanning E1 positions 192 to 319 to the amino acid
sequence as represented in SEQ ID NO 144;

- a sequence having more than 95%, more particularly 96%, most particularly 97% or more
homology to the amino acid sequence as represented in SEQ ID NO 146, in the region
spanning positions 2645 to 2757 of the NS5B region.

15. A composition according to any of claims 9 to 14, wherein said sequence is selected from
the following peptides:

QPTGRSWGQ (SEQ ID NO 93)
RSEGRTSWAQ (SEQ ID NO 220)
RTEGRTSWAQ (SEQ ID NO 221)
SRRQPDPRARRTEGRSWAQ (SEQ ID NO 268)
LEWRNTSGLYVL (SEQ ID NO 83)
VNYRNASGIYHI (SEQ ID NO 126)
QHYRNISGIYHV (SEQ ID NO 127)
EHYRNASGIYHI (SEQ ID NO 128)
IHYRNASGIYHI (SEQ ID NO 224)
VPYRNASGIYHV (SEQ ID NO 84)
VNYRNASGIYHL (SEQ ID NO 225)
VNYRNASGVYM (SEQ ID NO 226)
VNYHNTSGIYHL (SEQ ID NO 227)
QHYRNASGIYHV (SEQ ID NO 228)
QHYRNVSGIYHV (SEQ ID NO 229)
IHYRNASDGYYI (SEQ ID NO 230)
LQVKNTSSSYMV (SEQ ID NO 231)
VYEADDVILHT (SEQ ID NO 85)
VYETEHHILHL (SEQ ID NO 129)
VYEADHHIMHL (SEQ ID NO 130)
VYETDHHILHL (SEQ ID NO 131)
VYEADNLILHA (SEQ ID NO 86)
VWQLRAIVLHV (SEQ ID NO 232)
VYEADYHILHL (SEQ ID NO 233)
VYETDNHILHL (SEQ ID NO 234)
VYETENHILHL (SEQ ID NO 235)
VFETVHHILHL (SEQ ID NO 236)
VFETEHHILHL (SEQ ID NO 237)
VFETDHHIMHL (SEQ ID NO 238)
VYETENHILHL (SEQ ID NO 239)
VYEADALILHA (SEQ ID NO 240)
VQDGNTSTCWTPV (SEQ ID NO 87)
VQDGNTSACWTPV (SEQ ID NO 241)
VRVGNQSRCWVAL (SEQ ID NO 132)
VRTGNTSRCWVPL (SEQ ID NO 133)
VRAGNVSRCWTPV (SEQ ID NO 134)
EEKGNISRCWIPV (SEQ ID NO 242)
VKTGNQSRCWVAL (SEQ ID NO 243)
VRTGNQSRCWVAL (SEQ ID NO 244)
VKTGNQSRCWIAL (SEQ ID NO 245)
VKTGNVSRCWIPL (SEQ ID NO 247)
VKTGNVSRCWISL (SEQ ID NO 248)
VRKDNVSRCWVQI (SEQ ID NO 249)
YRYVGATTAS (SEQ ID NO 89)
APYIGAPLES (SEQ ID NO 135)
APYVGAPLES (SEQ ID NO 136)
AVSMDAPLES (SEQ ID NO 137)
APSLGAVTAP (SEQ ID NO 90)
APSFGAVTAP (SEQ ID NO 250)
VSQPGALTKG (SEQ ID NO 251)
VKYVGATTAS (SEQ ID NO 252)
APYIGAPVES (SEQ ID NO 253)
AQHLNAPLES (SEQ ID NO 254)
SPYVGAPLEP (SEQ ID NO 255)
SPYAGAPLEP (SEQ ID NO 256)
APYLGAPLEP (SEQ ID NO 257)
APYLGAPLES (SEQ ID NO 258)
APYVGAPLES (SEQ ID NO 259)
VPYLGAPLTS (SEQ ID NO 260)
APHLRAPLSS (SEQ ID NO 261)
APYLGAPLTS (SEQ ID NO 262)
RPRRHQTVQT (SEQ ID NO 91)
QPRRHWTTQD (SEQ ID NO 138)
RPRRHWTTQD (SEQ ID NO 139)
RPRQHATVQN (SEQ ID NO 92)
RPRQHATVQD (SEQ ID NO 263)
SPQHHKFVQD (SEQ ID NO 264)
RPRRLWTTQE (SEQ ID NO 265)
PPRIHETTQD (SEQ ID NO 266)
TISYANGSGPSDDK (SEQ ID NO 267)

16. Recombinant vector, particularly for cloning and/or expression, with said recombinant
vector comprising a vector sequence, an appropriate prokaryotic, eukaryotic or viral promoter
sequence followed by the nucleotide sequences as defined in claims 1 to 5, with said
recombinant vector allowing the expression of any one of the HCV type 2 and/or HCV type
3 and/or type 4 and/or type 5 derived polypeptides according to any of claims 9 to 15 in a
prokaryotic or eukaryotic host, or in living mammals when injected as naked DNA, and more
particularly a recombinant vector allowing the expression of any of the following HCV type
2, HCV type 3, type 4 or type 5 polypeptides spanning the following amino acid positions:

- a polypeptide starting at position 1 and ending at any position in the region between
positions 70 and 326, more particularly a polypeptide spanning positions 1 to 70, 1 to 85,
positions 1 to 120, positions 1 to 150, positions 1 to 191, positions 1 to 200, for expression
of the Core protein, and positions 1 to 263, positions 1 to 326, for expression of the Core
and E1 protein;

- a polypeptide starting at any position in the region between positions 117 and 192, and
ending at any position in the region between positions 263 and 326, more particularly from
positions 119 to 326, for expression of E1, or forms that have the putative membrane
anchor deleted (positions 264 to 293 plus or minus 8 amino acids);

- a polypeptide starting at any position in the region between positions 1556 and 1688, and
ending at any position in the region between positions 1739 and 1764, for expression of
the NS4 regions, more particularly, a polypeptide starting at position 1658 and ending at
position 1711 for expression of the NS4a antigen, and more particularly, a polypeptide
starting at position 1712 and ending between positions 1743 and 1972, for example 1712-
1743, 1712-1764, 1712-1782, 1712-1972, 1712 to 1782 and 1902 to 1972 for expression
of the NS4b protein or parts thereof.
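
The position ranges recited above are 1-based and inclusive. As a non-limiting illustration, the following Python sketch shows how such a range maps onto a string slice, and how an internal stretch such as the putative membrane anchor (positions 264 to 293) can be removed from a fragment. The polyprotein used here is a hypothetical placeholder string, not an HCV sequence, and the helper names are invented for this sketch.

```python
def span(polyprotein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of the polyprotein."""
    if not 1 <= start <= end <= len(polyprotein):
        raise ValueError("span outside the polyprotein")
    return polyprotein[start - 1:end]


def delete_span(fragment: str, fragment_start: int, del_start: int, del_end: int) -> str:
    """Remove positions del_start..del_end (polyprotein numbering) from a
    fragment whose first residue sits at position fragment_start."""
    fragment_end = fragment_start + len(fragment) - 1
    if not fragment_start <= del_start <= del_end <= fragment_end:
        raise ValueError("deleted span must lie inside the fragment")
    return fragment[:del_start - fragment_start] + fragment[del_end - fragment_start + 1:]


# Hypothetical placeholder polyprotein of 400 residues (letters A..Z repeated).
polyprotein = "".join(chr(65 + i % 26) for i in range(400))

core = span(polyprotein, 1, 191)       # a Core-length fragment
e1 = span(polyprotein, 192, 326)       # an E1-length fragment
e1_no_anchor = delete_span(e1, 192, 264, 293)
print(len(core), len(e1), len(e1_no_anchor))
```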

17. A composition according to any of claims 9 to 15, wherein said polypeptide is a 
recombinant polypeptide expressed by means of an expression vector as defined in claim 16. 

18. A composition according to any of claims 9 to 15 or 16, for use in a method for
immunizing a mammal, preferably humans, against HCV comprising administering a
sufficient amount of the composition possibly accompanied by pharmaceutically acceptable
adjuvants, to produce an immune response, more particularly a vaccine composition including
HCV type 3 polypeptides derived from the E1, Core, or NS4 region and/or type 4 and/or type
5 and/or type 2 polypeptides.

19. Antibody raised upon immunization with a composition according to any of claims 9 to
15, 17 or 18, by means of a process according to claim 18, with said antibody being reactive
with any of the polypeptides as defined in any of claims 9 to 15, 17 or 18.

20. Process for detecting in vitro HCV present in a biological sample liable to contain it,
comprising at least the following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
with any of the compositions according to claims 9 to 15, 17 or 18, preferentially
in an immobilized form under appropriate conditions which allow the formation
of an immune complex, wherein said polypeptide is preferentially in the form of
a biotinylated polypeptide and is covalently bound to a solid substrate by means
of streptavidin or avidin complexes,

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
specifically bind to the antibodies present in the sample to be analyzed, with said
heterologous antibodies being conjugated to a detectable label under appropriate
conditions,

(iv) detecting the presence of said immune complexes visually or by means of
densitometry and inferring the HCV serotype(s) present from the observed
hybridization pattern.

21. Use of a composition according to any of claims 9 to 15, 17 or 18, for incorporation into
a serotyping assay for detecting one or more serological types of HCV present in a biological
sample liable to contain it, more particularly for detecting E1 and NS4 antigens or antibodies
of the different types to be detected combined in one assay format, comprising at least the
following steps:

(i) contacting the biological sample to be analyzed for the presence of HCV antibodies
or antigens of one or more serological types, with at least one of the compositions
according to claims 9 to 15, 17 or 18 in an immobilized form under appropriate
conditions which allow the formation of an immune complex (wherein said
polypeptide is preferentially in the form of a biotinylated polypeptide and is
covalently bound to a solid substrate by means of streptavidin or avidin
complexes),

(ii) removing unbound components,

(iii) incubating the immune complexes formed with heterologous antibodies, which
specifically bind to the antibodies present in the sample to be analyzed, with said
heterologous antibodies being conjugated to a detectable label under appropriate
conditions,

(iv) detecting the presence of said immune complexes visually or by means of
densitometry and inferring the HCV serological types present from the observed
binding pattern.

22. A kit for determining the presence of HCV genotypes as defined in any of claims 1 to 5
present in a biological sample liable to contain them, comprising:

- possibly at least one primer composition containing any primer selected from those
defined in claim 6 or any other HCV type 2 and/or HCV type 3 and/or HCV type 4
and/or HCV type 5, or universal HCV primers,

- at least one probe composition according to claim 7, preferably in combination with
other polypeptides or peptides from HCV type 1, type 2 or other types of HCV, with
said probes being preferentially immobilized on a solid substrate, and more
preferentially on one and the same membrane strip,

- a buffer or components necessary for producing the buffer enabling hybridization
reaction between these probes and the possibly amplified products to be carried out,

- a means for detecting the hybrids resulting from the preceding hybridization,

- possibly also including an automated scanning and interpretation device for inferring
the HCV genotype(s) present in the sample from the observed hybridization pattern.

23. A kit for determining the presence of HCV antibodies according to any of claims 9 to 15,
17 or 18 present in a biological sample liable to contain them, comprising:

- at least one polypeptide composition according to any of claims 9 to 15, 17 or 18,
with said polypeptides being preferentially immobilized on a solid substrate, and more
preferentially on one and the same membrane strip,

- a buffer or components necessary for producing the buffer enabling the binding reaction
between these polypeptides and the antibodies against HCV present in the biological
sample,

- a means for detecting the immune complexes formed in the preceding binding
reaction,

- possibly also including an automated scanning and interpretation device for inferring
the HCV genotype present in the sample from the observed binding pattern.
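
The last element of claims 22 and 23, inferring the HCV type(s) from the observed hybridization or binding pattern, can be illustrated with a minimal sketch. The strip layout below (which line carries which type-specific probe, and the presence of a universal control line) is purely hypothetical and is not taken from the claims; only the idea of mapping reactive lines to types comes from the text.

```python
from typing import Dict, Iterable, Set

# Hypothetical strip layout: line number -> HCV type the line is specific for.
# These assignments are illustrative assumptions, not part of the claims.
PROBE_TO_TYPE: Dict[int, str] = {
    1: "type 2",
    2: "type 3",
    3: "type 4",
    4: "type 5",
}
UNIVERSAL_LINE = 0  # hypothetical control line carrying a universal HCV probe


def infer_genotypes(reactive_lines: Iterable[int]) -> Set[str]:
    """Return the HCV types whose type-specific lines are reactive.

    An empty set is returned when the universal control line is not
    reactive, because the pattern then gives no evidence of HCV material.
    """
    lines = set(reactive_lines)
    if UNIVERSAL_LINE not in lines:
        return set()
    return {PROBE_TO_TYPE[n] for n in lines if n in PROBE_TO_TYPE}


if __name__ == "__main__":
    # A sample reacting with the control line and lines 1 and 4.
    print(sorted(infer_genotypes([0, 1, 4])))
```

A set is returned rather than a single value because the claims allow for more than one type being present in the same sample.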
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Innogenetics sa.
(B) STREET: Industriepark Zwijnaarde 7, box 4
(C) CITY: Ghent
(E) COUNTRY: Belgium
(F) POSTAL CODE (ZIP): B-9052
(G) TELEPHONE: 00 32 9 241 07 11
(H) TELEFAX: 00 32 9 241 07 99

(ii) TITLE OF INVENTION: New sequences of hepatitis C virus genotypes
for diagnosis, prophylaxis and therapy.

(iii) NUMBER OF SEQUENCES: 270

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR34-4-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CTC ACG GAA CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGC TGC CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGC TAC ATC AAG GCC ACA GCC GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AGG GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-23-18

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTC ACG GAA CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGC TGC CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGC TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AGG GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Arg Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-23-18

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



10 15 

S G?n nf CGT TGC CGT GCC AGT G <* GTT CTg'cCT ACC 

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr

AGC TTC GGC AAC ACA ATC ACT TGT TAC ATC AAA GCC ACA GCG GCC GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Ts til Su ST f C f" m CTT GTC TOC GGA GAT GAT 

Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-23-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTT AAC AGC AAA GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGC CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGC TTC GGC AAC ACA ATC ACT TGT TAC ATC AAA GCC ACA GCG GCC GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AGC CCG GAC TTT CTT GTC TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

CTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70
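
The relationship between a CDS entry and the protein entry that follows it can be checked by translating the nucleotide sequence with the standard genetic code. The sketch below does this for SEQ ID NO: 7 as printed above; the function and variable names are illustrative and are not part of the listing.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence (whitespace ignored) to one-letter codes."""
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# SEQ ID NO: 7 (clone BR36-23-20), copied from the listing above.
SEQ_ID_7 = """
CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTT AAC AGC AAA GGG
GCC CAG TGT GGT TAT CGC CGT TGC CGT GCC AGT GGA GTT CTG CCT ACC
AGC TTC GGC AAC ACA ATC ACT TGT TAC ATC AAA GCC ACA GCG GCC GCA
AAA GCC GCA GGC CTC CGG AGC CCG GAC TTT CTT GTC TGC GGA GAT GAT
CTG GTC GTG GTG GCT GAG AGT
"""

protein = translate(SEQ_ID_7)

if __name__ == "__main__":
    print(len(protein), protein)
```

Under these assumptions the 213 bases give 71 residues, which is the length stated for the corresponding protein entry, SEQ ID NO: 8.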



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Ser Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-2-17

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGT CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGT TTC GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTT TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

TTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

Leu Val Val Val Ala Glu Ser
65 70
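
Because the entries above cover the same 213-base region in different clones, they can be compared position by position, which is the kind of comparison that underlies the homology figures quoted in the description. The following is a minimal sketch that counts mismatches between SEQ ID NO: 7 and SEQ ID NO: 9 as printed in this listing; it assumes the two sequences are already aligned and of equal length, and the helper names are illustrative.

```python
def clean(seq: str) -> str:
    """Remove whitespace and upper-case a sequence copied from the listing."""
    return "".join(seq.split()).upper()


def count_mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length, pre-aligned sequences."""
    a, b = clean(a), clean(b)
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    length = len(clean(a))
    return 100.0 * (length - count_mismatches(a, b)) / length


# SEQ ID NO: 7 (clone BR36-23-20) and SEQ ID NO: 9 (clone BR33-2-17).
SEQ_ID_7 = """
CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTT AAC AGC AAA GGG
GCC CAG TGT GGT TAT CGC CGT TGC CGT GCC AGT GGA GTT CTG CCT ACC
AGC TTC GGC AAC ACA ATC ACT TGT TAC ATC AAA GCC ACA GCG GCC GCA
AAA GCC GCA GGC CTC CGG AGC CCG GAC TTT CTT GTC TGC GGA GAT GAT
CTG GTC GTG GTG GCT GAG AGT
"""
SEQ_ID_9 = """
CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG
GCC CAG TGT GGT TAT CGC CGT TGT CGT GCC AGT GGA GTT CTG CCT ACC
AGT TTC GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA GCG GCT GCA
AAA GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTT TGC GGA GAT GAT
TTG GTC GTG GTG GCT GAG AGT
"""

if __name__ == "__main__":
    print(count_mismatches(SEQ_ID_7, SEQ_ID_9))
    print(round(percent_identity(SEQ_ID_7, SEQ_ID_9), 2))
```

No gap handling is attempted here; sequences of unequal length would first need a proper alignment step, which this sketch deliberately leaves out.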

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 213 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-2-21

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..213

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTC ACG GAG CGG CTT TAC TGC GGG GGC CCT ATG TTC AAC AGC AAG GGG 48
Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

GCC CAG TGT GGT TAT CGC CGT TGT CGT GCC AGT GGA GTT CTG CCT ACC 96
Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

AGT TTC GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA GCG GCT GCA 144
Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

AAA GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTT TGC GGA GAT GAT 192
Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp
50 55 60

TTG GTC GTG GTG GCT GAG AGT 213
Leu Val Val Val Ala Glu Ser
65 70

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly
1 5 10 15

Ala Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr
20 25 30

Ser Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Ala Ala Ala
35 40 45

Lys Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asn 
50 55 60 

Leu Val Val Val Ala Glu Ser 
65 70 

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-2-5

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

C GTC GGC GCT CCT GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAC ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT CAG GAC GGT AAT 286
Val Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCT GCG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGG CAT GTA GAC ATG TTG GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Met Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCT CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCA CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAC CGA ATG GCT 541
Gly His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Met Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180
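
The CDS feature of the 541-base entries is given as LOCATION 2..541 rather than 1..541, meaning that the first base is not part of the reading frame. A small sketch of the arithmetic, assuming 1-based inclusive coordinates as used in the listing (the function name is illustrative):

```python
from typing import Tuple


def cds_frame(location: str, seq_length: int) -> Tuple[int, int]:
    """Return (zero-based offset of the first codon, number of codons).

    `location` is a 1-based inclusive range such as "2..541"; stray spaces
    are tolerated.
    """
    start_text, end_text = location.replace(" ", "").split("..")
    start, end = int(start_text), int(end_text)
    if not 1 <= start <= end <= seq_length:
        raise ValueError("location lies outside the sequence")
    n_bases = end - start + 1
    if n_bases % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return start - 1, n_bases // 3


if __name__ == "__main__":
    print(cds_frame("2..541", 541))   # the 541-base entries
    print(cds_frame("1..213", 213))   # the 213-base entries
```

For 2..541 this gives an offset of one base and 180 codons, matching the 180-residue protein entries; for 1..213 it gives no offset and 71 codons.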

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-2-14

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

C GTC GGC GCT CCT GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CCT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Pro Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAC ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT CAG GAC GGT AAT 286
Val Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCT GCG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGG CAT GTA GAC ATA TTG GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val
115 120 125

GGC GCG GCC ACA ATG TGC TCT GCT CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCA CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAC CGA ATG GCT 541
Gly His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Pro Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: HD10-3-21

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

C GTC GGC GCT CCT GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAC ACG TCT GGC CTC TAC GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT CAG GAC GGT AAT 286
Val Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCT GCG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGG CAT GTA GAC ATA TTG GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCT CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCA CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAC CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Ala Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Arg His Val Asp Ile Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-9-13

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATT TTC CTT CTT GCT CTG TTC TCT TGG TTA ATT 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAC GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGC ATA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCC ACG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AAG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr
100 105 110

GTC GGA GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TC£ GCG CTC TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT CGT CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-9-20

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATT TTC CTT CTT GCT CTG TTC TCT TGC TTA ATT 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT AGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGC AGT ATT GTG TAC GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC ACA CCC GGC TGC ATA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACA TCC ACG TGC TGG ACC CCA GTG ACA CCT ACA GTG GCA GTC AAG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr
100 105 110
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*f* TCG ATA CGC AGT <** GTG GAC CTA TTA GTG 

Val Gly Ala Thr Thr Ala S er He Arg Ser His Val Asp £u Su vlt 

120 125, 

GGC GCG GCC ACG ATG TGC TCP GCG CTC TAC GTG GGT GAG ATG TGT GGG 
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp ^ 

' • 135 140 

GCT GTC TTC CTC GTG GGA CAA GCC TTC ACQ TTC AGA CCT CGT CGC CAT 
Ala Val Phe Leu Val Gly din Ala, Phe Thr Phe Arg Pro Arg Arg SI 

Si S T f ° AG TGT ^ TGC TCG CTG ' TAC CCA GGC CAT CTT TCA 

Gin Thr Val Gin Thr Cys Asn Cys Ser Leu Tyr Pro Gly his SI S 



170 



175 



15 



Arg Ala Leu Glu Asp Gly He Asn Phe Ala Thr Gly' Asn Leu Pro Gly 
20 25 



30 



Cys ser Phe Ser He Phe Leu Leu Ala Leu Phe Ser Cys Leu He His 
35 40 45 

Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu 
50 55 60 

Thr Asn Asp Cys Ser Asn Ser Ser He Val Tyr Glu Ala Asp Asp Val 

70 75 80 

He Leu His Thr Pro Gly Cys He Pro Cys Val Gin Asp Gly Asn Thr 

85 90 95 

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr Val 
100 105 110 

Gly Ala Thr Thr Ala Ser He Arg Ser His Val Asp Leu Leu Val Glv 
115 120- 125 

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala 
130 i35 140 



362 



430 



478 



526 



GGA CAT CGA ,ATG GCT 

Gly His Arg Met Ala / 1 541 

180 ' • 

. . . i 

1 1 

(2) INFORMATION FOR SEQ ID NO: 22: 

• 1 

1 

(i) SEQUENCE CHARACTERISTICS: • 

(A) LENGTH: 180 amino acids 

(B) TYPE: amino acid ' 
(D) TOPOLOGY: linear 

(ii). MOLECULE TYPE: protein 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: ' ' 

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val 

5 10 
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Val Phe Leu Val Gly Gin Ala' Phe Thr Phe Arg Pro Arg Arg His Gin 
145 150 155 1 160 

Thr Val Gin Thr Cys.Ash Cys Ser Leu Tyr Pro Gly His Leu Ser Gly 
165 170 ' 175 

♦ 

His Arg Met Ala 

180 1 l ■ ' , 

i • ■ 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR33-1-10

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..541

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAG GAC GGG ATA AAC TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT GGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGT AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC GCG CCC GGC TGT GTA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95
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ACG TCT ACA TGC TGG ACC CCA GTA ACA CCT ACA GTG GCA GTC AGG TAC 
Thr Ser Thr cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg 

100 105 , , 110 , 

GTC GGG GCA ACC ACC GCT TCG ATA CGC AGT' CAT GTG GAC CTG TTA GTA 
Val Gly Ala Thr Thr Ala Ser lie Arg Ser His Val Asp Leu Leu Val 
115 ' 120 125 

GGC GCG GCC ACG ATG TGC , TCT GCG CTT TAC GTG GGT GAT ATG TGT GGG 

•> y fi! ^ M8t 0X8 Ser Ala LSU TV Val G1 y ASP Cys Gly 

' 130 135 140 

' l ^ 2^ ' CTC GGA M GCC TTC ACG TTC AGA CCC CGC CGC CAT 
Ala val Phe ,Leu Val Gly Gin Ala Phe Thr Phe Arg Pro Arg Arg His 
'• 1 145 icn 



60 



Thr Asn Asp Cys Ser Asn Ser Ser lie Val Tyr Glu Ala Asp Asp Val 
65 70 75 " 80 

He Leu His Ala Pro Gly Cys Val Pro Cys Val Gin Asp Gly Asn Thr 
85 90 95 

Ser Thr cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val 
100 105 no 

Gly Ala Thr Thr Ala Ser He Arg Ser His Val Asp Leu Leu Val Gly 



334 
I 



382 



430 



478 



526 



145 150 

CAA ACG GTC CAG ACC TGT AAC TGO TCG CTG TAC CCA GGC CAT CTT TCA 
Gin Thr Val Gin Thr Cys Asn Cys Ser Leu Tyr. Pro Gly His Leu IS 
160 l « 170 175 

I 1 

GGA CAT CGC ATG GCT . A . • 

Gly ( His Arg Met Ala , 541 

180 1 

(2) INFORMATION FOR SEQ ID N6: 24: 

i i 
(i) SEQUENCE CHARACTERISTICS: 
1 i (A) LENGTH: 180 amino acids 

\ (B) TYPE': amino acid , 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 : 1 

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val 

5 10 15 

Arg Ala Leu Glu Asp Gly He Asn Phe Ala Thr Gly Asn Leu Pro Glv 

20 25 30 . 

Cys Ser Phe Ser He Phe Leu Leu Ala Leu Phe Ser Cys Leu He His 
35 40 45 

Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu 
50 55 



BNSDOCID: <WO 9425601A2_1_> 



SUBSTITUTE SHEET (RULE 26) 



WO 94/25601 PCT/EP94/01323 

115 

* 115 . 120 125 

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala 
130 135 . • ■ ■ 140 

» 

Val Phe Leu Val Gly Gin Ala Phe Thr Phe Arg Pro Arg Arg His Gin 
145 150 ' 155 160 

Thr Val Gin Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly 
\ ' , 165 170 i 175 

i * ' . . » 

His Arg Met Ala « 

180 . 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 541 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: BR33-1-19

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..541



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAG GAC GGG ATA AAC TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCT TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT GGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

CTT ACC AAC GAC TGT TCC AAT AGT AGT ATT GTG TAT GAG GCC GAT GAC 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC GCG CCC GGC TGT GTA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

ACG TCT ACA TGC TGG ACC CCA GTA ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

GTC GGG GCA ACC ACC GCT TCG ATA CGC AGT CAT GTG GAC CTG TTA GTA 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCG CTT TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCC CGC CGC ^ ^ Q
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

i 1 ' • 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: BR33-1-20
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..541



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

C GTC GGC GCT CCC GTA GGA GGC GTC GCA AGA GCC CTT GCG CAT GGC 46
Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
1 5 10 15

GTG AGG GCC CTT GAG GAC GGG ATA AAC TTC GCA ACA GGG AAT TTG CCC 94
Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro
20 25 30

GGT TGC TCT TTT TCT ATC TTC CTT CTT GCT CTG TTC TCT TGC TTA ATC 142
Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile
35 40 45

CAT CCA GCA GCT GGT CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC 190
His Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val
50 55 60

XX 2 Z SI S? = £ S 2 S - S - - 2 , 238
Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp
65 70 75

GTT ATT CTG CAC GCG CCC GGC TGT GTA CCT TGT GTC CAG GAC GGC AAT 286
Val Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn
80 85 90 95

■JS ACC ' CCA ° TA ACA CCT ACA GTG GCA GTC AGG TAC 334
Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr
100 105 110

vll 25 2* if A k° TC ° ATA CGC AGT '«f GTG GAC CTG TTA GTA 382
Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
115 120 125

GGC GCG GCC ACG ATG TGC TCT GCG CTT TAC GTG GGT GAT ATG TGT GGG 430
Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly
130 135 140

til 2? 11° T ° TC GGA ^ GCC ACG TTC A ° A CCC CGC CGC CAT 478
Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His
145 150 155

J? GTC CAG ACC TGT AAC TGC TCG CTG TAC CCA GGC CAT CTT TCA 526
Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser
160 165 170 175

GGA CAT CGA ATG GCT 541
Gly His Arg Met Ala
180

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val
1 5 10 15

Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly
20 25 30

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His
35 40 45

Pro Ala Ala Gly Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
50 55 60

Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val
65 70 75 80

Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr
85 90 95

Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
100 105 110

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly
115 120 125

Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala
130 135 140

Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln
145 150 155 160

Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly
165 170 175

His Arg Met Ala
180

• • . 

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 287 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: HCC1153

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..287



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

TA GAC TTT TGG GAG AGC GTC TTC ACT GGA CTA ACT CAC ATA GAT GCC 47
Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
1 5 10 15

CAC TTT CTG TCA CAG ACT AAG CAG CAG GGA CTC AAC TTC TCG TTC CTG 95
His Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu
20 25 30

ACT GCC TAC CAA GCC ACT GTG TGC GCT CGC GCG CAG GCT CCT CCC CCA 143
Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro
35 40 45

AGT TGG GAC GAG ATG TGG AAG TGT CTC GTA CGG CTT AAG CCA ACA CTA 191
Ser Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu
50 55 60

CAT GGA CCT ACG CCT CTT CTA TAT CGG TTG GGG CCT GTC CAA AAT GAA 239
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu
65 70 75

ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG GCA TGC ATG TCA 287
Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser
80 85 90 95

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 95 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
1 5 10 15

Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu Thr
20 25 30

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
35 40 45

Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu His
50 55 60

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile
65 70 75 80

Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser
85 90 95

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 401 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: HD10-1-25

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

TC CAA AAT GAA ATC TGC TTG ACA CAC CCC GTC ACA AAA TAC ATT ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTG TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGC 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGC GTT GTA ATC GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA CTC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu
50 55 60

GTT CCA GAC AAG GAG GTG TTG TAT CAA CAG TAC GAT GAG ATG GAG GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCG CAA GCC GCC CCA TAC ATC GAA CAA GCT CAG GTA ATA GCC CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAG AAA ATC CTT GGA CTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC GTA ATA GCT TCC AAC TGG CAA AAG CTT GAA 383
Gln Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu
115 120 125

ACC TTC TGG CAC AAG CAT 401
Thr Phe Trp His Lys His
130 
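
The pairing of each nucleotide row with its amino acid row can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and taking the first row of SEQ ID NO: 31 with the CDS start at position 3 as stated in its FEATURE entry; the helper name translate_cds is hypothetical and not part of the listing.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes, matching the notation used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_cds(sequence, cds_start):
    """Translate a DNA string from a 1-based CDS start position.

    Whitespace is ignored; trailing bases that do not fill a codon are dropped.
    (Hypothetical helper written for this illustration.)
    """
    nt = "".join(sequence.split()).upper()
    coding = nt[cds_start - 1:]
    usable = len(coding) - len(coding) % 3
    return [
        THREE_LETTER[CODON_TABLE[coding[i:i + 3]]]
        for i in range(0, usable, 3)
    ]


# First row of SEQ ID NO: 31 (positions 1 to 47), CDS LOCATION 3..401.
first_row = "TC CAA AAT GAA ATC TGC TTG ACA CAC CCC GTC ACA AAA TAC ATT ATG"
residues = translate_cds(first_row, cds_start=3)
print(" ".join(residues))
```

The printed residues can be compared against the amino acid row under the same nucleotide row; a mismatch would point to a transcription error in one of the two lines.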



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 133 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu Thr
115 120 125

Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 33: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 401 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: HD10-1-3

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:



TC CAA AAT GAA ATC TGC TTG ACA CAC CCC GTC ACA AAA TAC ATT ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTG TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGC 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGC GTT GTA ATC GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA CTC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu
50 55 60

GTT CCA GAC AAG GAG GTG TTG TAT CAA CAG TAC GAT GAG ATG GAG GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCG CAA GCC GCC CCA TAC ATC GAA CAA GCT CAG GTA ATA GCC CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAG AAA ATC CTT GGA CTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC GTA ATA GCT TCC AAC TGG CAA AAG CTT GAA 383
Gln Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu
115 120 125

ACC TTC TGG CAC AAG CAT 401
Thr Phe Trp His Lys His
130



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 133 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Leu Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Val Ile Ala Ser Asn Trp Gln Lys Leu Glu Thr
115 120 125

Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 401 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: BR36-20-164

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 3..401

. \ ■ , 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

TC Sf* TTG ACA ^ CCC ATC ACA *»* TAG ATC ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

CTT GGA GGG GTC CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGT 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

TGT GTT GTG ATT GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA ATC 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile
50 55 60

GTT CCA GAC AAA GAG GTG TTG TAT CAA CAA TAC GAT GAG ATG GAA GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

TGC TCA CAA GCT GCC CCA TAT ATC GAA CAA GCT CAG GTA ATA GCT CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GGA AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Gly Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG 383
Gln Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu
115 120 125

GCC TTT TGG CAC AAG CAT 401
Ala Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 133 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Gly Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala
115 120 125

Phe Trp His Lys His 
130 
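
Closely related clones in this listing differ at only a few residues, and such differences can be located by a position-wise comparison. The sketch below is illustrative only: it compares the first 16 residues of SEQ ID NO: 34 and SEQ ID NO: 36 as printed above, and the function name differing_positions is a hypothetical helper, not something defined in the listing.

```python
def differing_positions(seq_a, seq_b):
    """Return (position, residue_a, residue_b) for each mismatch.

    Positions are 1-based, as in the numbering rows of the listing.
    Both inputs must have the same length (no alignment is attempted).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# Residues 1 to 16 of SEQ ID NO: 34 and SEQ ID NO: 36.
seq34_start = "Gln Asn Glu Ile Cys Leu Thr His Pro Val Thr Lys Tyr Ile Met Ala".split()
seq36_start = "Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala".split()

diffs = differing_positions(seq34_start, seq36_start)
identity = 1 - len(diffs) / len(seq34_start)

for pos, a, b in diffs:
    print(f"position {pos}: {a} -> {b}")
print(f"identity over {len(seq34_start)} residues: {identity:.4f}")
```

Extending the two lists to the full 133 residues would give the overall identity between the two clones; only equal-length, ungapped sequences are handled by this sketch.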

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 401 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: BR36-20-166

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



TC CAA AAT GAA ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG 47
Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met
1 5 10 15

GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG 95
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30

17 ^ 2? G GCG GCC CTA ° CG GCC TGC TTG TCA GTC GGT 143
Leu Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly
35 40 45

5 S S 5 s - - s ~ « g - «« 191
Cys Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile
50 55 60

vll pS ^ TAC GAT ATG GAA GAG 239
Val Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu
65 70 75

^ GCC CCA TAT ATC GAA ^ <* T CAG GTG ATA GCT CAC 287
Cys Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His
80 85 90 95

CAG TTC AAG GAA AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA 335
Gln Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

CAA GCT GTC ATT GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG 383
Gln Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu
115 120 125

GCC TTT TGG CAC AAG CAT 401
Ala Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala
115 120 125

Phe Trp His Lys His
130 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 401 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: BR36-20-165

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..401

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:



10 



15 



GCA TGC ATG TCA GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG
Ala Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu
20 25 30



95 



143 



191 



239 



287 



335 



383 



\ 2 2 3 2 22 2 2 £ 2 2 .2. 2 2 2 2 
22 2 22 2 2 2 2 2 22 22 2 2 

55 60 

2 2 2 2 2 2 2 22 222 2 2 22 
2 2 2 2 2 2 2 2 22 2 22 2 2 2 

90 95 
CAG TTC AAG GAA AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA M1 
Gln Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln
100 105 110

2 £ 2 2 - 2 22 2 2 2 2 2 2 2 2 

115 12 ° 125 

GCC TTT TGG CAC AAG CAT 401
Ala Phe Trp His Lys His
130

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 133 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Gln Asn Glu Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
1 5 10 15

Cys Met Ser Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu
20 25 30

Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys
35 40 45

Val Val Ile Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val
50 55 60

Pro Asp Lys Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys
65 70 75 80

Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
85 90 95

Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln
100 105 110

Ala Val Ile Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala
115 120 125

Phe Trp His Lys His
130

i • 

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 509 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-2-1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..509



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

= 2 = s:.= s z s.s g - a s 



70 75 



Gly £r S Pro 2f ^ !f C ' JttT GAG GGC CTC « TGG OCA « 

Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly



90 95 



TGG CTG CTC 1 TCC CCT CGA GGC TCTCGG CCT ART TGG rrr nrin *™ 

«P « Ser JJ. « g Wy _ „ « » £ « « £T «. 



150 155 , 



(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 169 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp



239 



287 



335 



383 



S 5 Z £ = 5 Z 2 £ - - - » « s 

115 1120 125 , 

i , : 135 140 

5 1 5 3 SSEES ~ 23 S S = ' = - - 



GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA 509
Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu
160 165

85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu

165 
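
Each nucleotide entry gives a CDS LOCATION, and the following protein entry gives a LENGTH in amino acids; the two should agree, since a CDS spanning positions start..end contains (end - start + 1) / 3 codons. The sketch below checks this relation for several LOCATION and LENGTH pairs copied from entries in this listing. The function name codon_count and the table layout are assumptions made for illustration.

```python
def codon_count(cds_start, cds_end):
    """Number of complete codons in a CDS given 1-based inclusive bounds."""
    span = cds_end - cds_start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS span {span} is not a multiple of 3")
    return span // 3


# (entry pair, CDS start, CDS end, stated protein length in amino acids)
entries = [
    ("SEQ ID NO: 25/26", 2, 541, 180),
    ("SEQ ID NO: 29/30", 3, 287, 95),
    ("SEQ ID NO: 31/32", 3, 401, 133),
    ("SEQ ID NO: 41/42", 3, 509, 169),
    ("SEQ ID NO: 45/46", 2, 580, 193),
]

results = {
    name: codon_count(start, end) == stated
    for name, start, end, stated in entries
}

for name, ok in results.items():
    print(name, "consistent" if ok else "MISMATCH")
```

A check of this kind is useful when repairing a scanned listing, because a LOCATION or LENGTH value damaged in extraction will usually break the relation.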

l • , i . 

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 509 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-2-6

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..509



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA 509
Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu
160 165

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 169 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu
165



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-4-1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..580



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

A ACG TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC 46
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly
1 5 10 15

GGC CCC ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC 94
Gly Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val
20 25 30

CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA CCC GGT TGC TCT 142
Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser
35 40 45

TTC TCT ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC 190
Phe Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala
50 55 60

TCT GCA GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT 238
Ser Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn
65 70 75

GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA 286
Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
80 85 90 95

CAC GCA CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA 334
His Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg
100 105 110

TGC TGG GTC CAA ATT ACC CCT ACA CTG TC^ GCC CCG AGC CTC GGA GCA 382
Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala
115 120 125

GTC ACG GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GC? 430
Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala
130 135 140

GCC CTC TGC TCC GCG TTX TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC 478
Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe
145 150 155

TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG 526
Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val
160 165 170 175

CAG AAC TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG 574
Gln Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg
180 185 190

ATG GCA 580
Met Ala



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met
180 185 190

Ala

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-4-6

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..580

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

A ACG TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC 46
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly
1 5 10 15

GGC CCC ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC 94
Gly Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val
20 25 30

CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA CCC GGT TGC TCT 142
Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser
35 40 45

TTC TCT ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC 190
Phe Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala
50 55 60

TCT GCA GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT 238
Ser Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn
65 70 75

GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA 286
Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
80 85 90 95

CAC GCA CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA 334
His Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg
100 105 110

TGC TGG GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA 382
Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala
115 120 125

GTC ACG GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT 430
Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala
130 135 140

GCC CTC TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC 478
Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe
145 150 155

TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG 526
Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val
160 165 170 175

CAG AAC TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG 574
Gln Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg
180 185 190

ATG GCA 580
Met Ala



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met
180 185 190

Ala

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: PC-3-4

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..959

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

ATT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA CCC GGT TGC TCT TTC TCT 527
Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser
160 165 170 175

ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC TCT GCA 575
Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala
180 185 190

GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT GAT TGC 623
Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys
195 200 205

CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA CAC GCA 671
Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala
210 215 220

CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA TGC TGG 719
Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp
225 230 235

GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA GTC ACG 767
Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr
240 245 250 255

GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT GCC CTC 815
Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu
260 265 270

TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC TTG GTA 863
Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val
275 280 285

GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG CAG AAC 911
Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn
290 295 300

TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG ATG GCA 959
Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val
180 185 190

Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro
210 215 220

Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val
225 230 235 240

Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala
245 250 255

Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly
275 280 285

Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys
290 295 300

Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..959



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 



CC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC 47
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr
1 5 10 15

AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT 95
Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val
20 25 30

GGC GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC 143
Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg
35 40 45

GCG ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG 191
Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln
50 55 60

CCT ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC 239
Pro Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro
65 70 75

GGG TAC CCT TGG CCC CTC TAC GCC AAT GAG GGC CTC GGG TGG GCA GGG 287
Gly Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly
80 85 90 95

TGG CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC 335
Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp
100 105 110

CCC CGG CGA AAA TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG 383
Pro Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr
115 120 125

TGC GGA TTC GCC GAT CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 431
Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro
130 135 140

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG 479
Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu
145 150 155

GAC GGG GTA AAC TAT CCA ACA GGG AAT TTA CCC GGT TGC TCT TTC TCT 527
Asp Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser
160 165 170 175

ATC TTT ATT CTT GCT CTT CTC TCG TGT CTG ACC GTT CCG GCC TCT GCA 575
Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala
180 185 190

GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT GAT TGC 623
Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys
195 200 205

CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA CAC GCA 671
Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala
210 215 220

CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA TGC TGG 719
Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp
225 230 235

GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA GTC ACG 767
Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr
240 245 250 255

GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT GCC CTC 815
Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu
260 265 270

TGC TCC GCG TTA TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC TTG GTA 863
Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val
275 280 285

GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG CAG AAC 911
Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn
290 295 300

TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG ATG GCA 959
Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 319 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val
180 185 190

Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro
210 215 220

Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val
225 230 235 240

Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala
245 250 255

Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly
275 280 285

Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys
290 295 300

Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 959 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC C/E1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..959

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:



CCATGAGCAC GAATCCTAAA CCTCAAAGAA AAACCAAAAG AAACACCAAC CGTCGCCCAC 60

AGGACGTCAA GTTCCCGGGC GGTGGTCAGA TCGTTGGCGG AGTTTACTTG TTGCCGCGCA 120

GGGGCCCTAG GATGGGTGTG CGCGCGACTC GGAAGACTTC GGAACGGTCG CAACCCCGTG 180

GACGGCGTCA GCCTATTCCC AAGGCGCGCC AGCCCACGGG CCGGTCCTGG GGTCAACCCG 240

GGTACCCTTG GCCCCTTTAC GCCAATGAGG GCCTCGGGTG GGCAGGGTGG CTGCTCTCCC 300

CTCGAGGCTC TCGGCCTAAT TGGGGCCCCA ATGACCCCCG GCGAAAATCG CGTAATTTGG 360

GTAAGGTCAT CGATACCCTA ACGTGCGGAT TCGCCGATCT CATGGGGTAY ATCCCGCTCG 420

TAGGCGGCCC CRTTGGGGGC GTCGCAAGGG CTCTCGCACA CGGTGTGAGG GTCCTTGAGG 480

ACGGGGTAAA CTATSCAACA GGGAATTTAC CCGGTTGCTC TTTCTCTATC TTTATTCTTG 540

CTCTTCTCTC GTGTCTGACC GTTCCGGCCT CTGCAGTTCC CTACCGAAAT GCCTCTGGGA 600

TTTATCATGT TACCAATGAT TGCCCAAACT CTTCCATAGT CTATGAGGCA GATAACCTGA 660

TCCTACACGC ACCTGGTTGC GTGCCTTGTG TCATGACAGG TAATGTGAGT AGATGCTGGG 720

TCCAAATTAC CCCTACACTG TCAGCCCCGA GCCTCGGAGC AGTCACGGCT CCTCTTCGGA 780

GAGCCGTTGA CTACCTAGCG GGAGGGGCTG CCCTCTGCTC CGCGTTATAC GTAGGAGACG 840

CGTGTGGGGC ACTATTCTTG GTAGGCCAAA TGTTCACCTA TAGGCCTCGC CAGCACGCTA 900

CGGTGCAGAA CTGCAACTGT TCCATTTACA GTGGCCATGT TACCGGCCAC CGGATGGCA 959



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 319 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val
180 185 190

Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro
210 215 220

Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val
225 230 235 240

Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala
245 250 255

Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly
275 280 285

Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys
290 295 300

Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 354 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-37

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..354

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
ACCACCGGAG CTTCTATCAC ATACTCCACT TACGGCAAGT TCCTTGCTGA TGGAGGGTGT 60
TCAGGCGGCG CGCATGACGT GATCATATGC GACGAGTGCC ATTCCCAGGA CGCCACCACC 120
ATTCTTGGGA TAGGCACTGT CCTTGACCAG GCAGAGACGG CTGGAGCTAG GCTCGTCGTC 180
TTGGCCACGG NCACCCCTCC CGGCAGTGTG ACAACGCCCC ACCCCAACAT CGAGGAAGTG 240
GCCCTGCCTC AGGAGGGGGA GGTTCCCTTC TACGGCAGAG CCATTCCCCT TGCTTTTATA 300
AAGGGTGGTA GGCATCTCAT CTTCTGCCAT TCCAAGAAAA ATTGTGATGA ACTC 354

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
1 5 10 15

Asp Gly Gly Cys Ser Gly Gly Ala His Asp Val Ile Ile Cys Asp Glu
20 25 30

Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
35 40 45

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Xaa
50 55 60

Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu Glu Val
65 70 75 80

Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Gly Arg Ala Ile Pro
85 90 95

Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
100 105 110

Lys Asn Cys Asp Glu Leu
115

(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 354 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-48

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..354

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

ACCACCGGAG CTTCTATCAC ATACTCCACT TACGGCAAGT TCCTTGCTGA TGGAGGGTGT 60 

TCAGGCGGCG CGTATGACGT GATCATATGC GACGAGTGCC ATTCCCAGGA CGCCACCACC 120 

ATTCTTGGGA TAGGCACTGT CCTTGACCAG GCAGAGACGG CTGGAGCTAG GCTCGTCGTC 180 

TTGGNCACGG NCACCCCTCC CGGCAGTGTG ACAACGCCCC ACCCCAACAT CGAGGAAGTG 240 

GCCCTGCCTC AGGAGGGGGA GGTTCCCTTC TACGGNAGAG CCATTCCCCT TGCTTTTATA 300

AAGGGTGGTA GGCATCTCAT CTTCTGCCAT TCCAAGAAAA AATGTGATGA ACTT 354
(2) INFORMATION FOR SEQ ID NO: 58: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 133 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala
1 5 10 15

Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val Ile Ile Cys Asp Glu
20 25 30

Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr Val Leu
35 40 45

Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Xaa Thr Xaa
50 55 60

Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu Glu Val
65 70 75 80

Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Xaa Arg Ala Ile Pro
85 90 95

Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys
100 105 110

Lys Lys Cys Asp Glu Leu Arg Gln Ala Thr Asp Gln Pro Gly Arg Glu
115 120 125

Arg Pro Trp Glu Tyr
130



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-37

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..357

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

ATGGCTTTCA TGTCTCCGGA CTTGGAGGTC ATTACCANCA CTTGGGTTCT GGTGGGGGGC 60

GTTGTGGCGA CCCTGNCGNC CTACTGCTTG ACGGTGGGTT CGGTAGCCAT AGTCGGTAGG 120

ATCATCCTCT CTGGGAAACC TGCCATCATT NCCGATAGGG AGGTATTATA CCAGCAATTT 180

GATGAGATGG AGGAGTGCTC GGCCTCGTTG CCCTATATGG ACGAAACACG TNCCATTGCC 240

GGACAATTCA AAGAGAAAGT GCTCGGCTTC ATCAGCACGA CCGGCCAGAA GGCTGAAACT 300

CTGAAGCCGG CAGCCACGTC TGTGTGGAAC AAGGCTGATC AGTTCTGGNC CACATAC 357

»' • ' 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Met Ala Phe Met Ser Pro Asp Leu Glu Val Ile Thr Xaa Thr Trp Val
1 5 10 15

Leu Val Gly Gly Val Val Ala Thr Leu Xaa Xaa Tyr Cys Leu Thr Val
20 25 30

Gly Ser Val Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala
35 40 45

Ile Ile Xaa Asp Arg Glu Val Leu Tyr Gln Gln Phe Asp Glu Met Glu
50 55 60

Glu Cys Ser Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Xaa Ile Ala
65 70 75 80

Gly Gln Phe Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln
85 90 95

Lys Ala Glu Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala
100 105 110

Asp Gln Phe Trp Xaa Thr Tyr Met Trp Asn Phe Ile Ser Gly Ile Gln
115 120 125



(2) INFORMATION FOR SEQ ID NO: 61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 357 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: PC-1-48

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..357

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

ATGGCTTGCA TGTCTGCGGA CCTGGAGGTC ATTACCANCA CTTGGGTTCT GGTGGGGGGC 60
GTTGTGGCGN CCCTGGCGGC CTACTGCTTG ACGGTGGGTT CGGTAGCCAT AGTCGGTAGG 120

ATCATCCTCT CTGGGAAACC TGCCATCATT CCCGATAGGG AGGCA*TATA CCANCAATTT 180

GATGAGATGG AGGAGTGCTC GGCCTCGTTG CCCTATATGG ACGAGACACG TGCCATTGCC 240

GGACAATTCA AAGAGAAAGT GCTCGGCTTC ATCAGCACGA CCGGCCAGAA GGCTGAAACT 300

CTGAAGCCGG CAGCCACGTC TGTGTGGAAC AAGGCTGANC AGTTCTGGGC CACATAC 357

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Met Ala Cys Met Ser Ala Asp Leu Glu Val Ile Thr Xaa Thr Trp Val
1 5 10 15

Leu Val Gly Gly Val Val Ala Xaa Leu Ala Ala Tyr Cys Leu Thr Val
20 25 30

Gly Ser Val Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala
35 40 45

Ile Ile Pro Asp Arg Glu Ala Leu Tyr Xaa Gln Phe Asp Glu Met Glu
50 55 60

Glu Cys Ser Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala
65 70 75 80

Gly Gln Phe Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln
85 90 95

Lys Ala Glu Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala
100 105 110

Xaa Gln Phe Trp Ala Thr Tyr Met Trp Asn Phe Ile Ser Gly Ile Gln
115 120 125

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr161"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

ACCGGAGGCC AGGAGAGTGA TCTCCTCC 28

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr162"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
GGGCTGCTCT ATCCTCATCG ACGCCATC 28



(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr163"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

i i • ' 

GCCAGAGGCT C&GAiAGGCGA TCAGCGCT 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr164"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GAGCTGCTCT GTCCTCCTCG ACGCCGCA 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr23"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CTCATGGGGT ACATTCCGCT 20

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES

(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr54"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 68 : 
CTATTACCAG TTCATCATCA TATCCCA 27 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr116"



i 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
TTTTAAATAC ATCATG^CTG YATG 24

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES

(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..28

(D) OTHER INFORMATION: /standard_name= "HCV Primer
HCPr66"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CTATTATTGT ATCCCRCTGA TGAARTTCCA CAT 33

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr118"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
ACTAGTCGAC TAYTGATCCR CTATliWARTT CCAC^T 36

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr117"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
TTTTAAATAC ATCGCRCTGC ATGCA 25
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr119"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
ACTAGTCGAC TARTTGCATA GCCKRTTCAT CCAYTG 36



WO 94/25601 PCT/EP94/01323



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr131"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GGAATTCTAG ACCTCTGGGA YGARAYTGGA ARTG 34

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr130"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
GGAATTCTAG ACGCTAYCAR GCACGTTGYG C 31



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr134"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
CATATAGATG CCCACTTCCT ATC 23

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr3"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
GTGTGCCAGG ACCATC 16
(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
GACATGCATG TCATGATGTA

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr152"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
TACGCCTCTT CTATATCGGT TGGGGCCTG 29

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr52"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

ATGTTTGGGTA AGGTCATCGA TACCCi;, i ' i 26 

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr41"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
CCCGGGAGGT CTCGTAGACC GTGCA 25
(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iii) ANTI-SENSE: YES

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..28
(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr40"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
CTATTAAAGA TAGAGAAAGA GCAACCGGG 29

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 192 to 203 of the V1 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 192 to 203 of the V1 region of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 213 to 223 of the V2 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
Val Tyr Glu Ala Asp Asp Val Ile Leu His Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 213 to 223 of the V2 region of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 230 to 242 of the V3 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
Val Gln Asp Gly Asn Thr Ser Thr Cys Trp Thr Pro Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 230 to 242 of the V3 region of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
Val Met Thr Gly Asn Val Ser Arg Cys Trp Val Gln Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 248 to 257 of the V4 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
Val Arg Tyr Val Gly Ala Thr Thr Ala Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 248 to 257 of the V4 region of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
Ala Pro Ser Leu Gly Ala Val Thr Ala Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 294 to 303 of the V5 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
Arg Pro Arg Arg His Gln Thr Val Gln Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 294 to 303 of the V5 region of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
Arg Pro Arg Gln His Ala Thr Val Gln Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 70 to 78 of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
Gln Pro Thr Gly Arg Ser Trp Gly Gln
1 5

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: BR33 and BR36

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 230 to 237 of the V3 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
Val Gln Asp Gly Asn Thr Ser Thr
1 5

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HD10

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 230 to 237 of the V3 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
Val Gln Asp Gly Asn Thr Ser Ala

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: BR36

(viii) POSITION IN PROTEIN:
(B) MAP POSITION: positions 248 to 257 of the V4 region of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
Val Lys Tyr Val Gly Ala Thr Thr Ala Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: BR36

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1688 to 1707 of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
Leu Gly Gly Lys Pro Ala Ile Val Pro Asp Lys Glu Val Leu Tyr Gln
1 5 10 15

Gln Tyr Asp Glu
20



(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HD10

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1688 to 1707 of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
Leu Gly Gly Lys Pro Ala Leu Val Pro Asp Lys Glu Val Leu Tyr Gln

Gln Tyr Asp Glu
20

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1712 to 1731

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
Ser Gln Ala Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln
1 5 10 15

Phe Lys Glu Lys

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: BR36

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1724 to 1743 of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
Ile Ala His Gln Phe Lys Glu Lys Val Leu Gly Leu Leu Gln Arg Ala
1 5 10 15

Thr Gln Gln Gln
20

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HD10

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1724 to 1743 of HCV type 3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
Ile Ala His Gln Phe Lys Glu Lys Ile Leu Gly Leu Leu Gln Arg Ala
1 5 10 15

Thr Gln Gln Gln
20

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1688 to 1707 of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
«« Ser 01y Ly . ,p ro Ma lU pr J ^ ^ ^
10 15

Gln Phe Asp Glu
20



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1688 to 1707

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln
10 15

Gln Phe Asp Glu
20



(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1712 to 1731 of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
Ser Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln
1 5 10 15

Phe Lys Glu Lys
20

(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(viii) POSITION IN GENOME:
(B) MAP POSITION: positions 1724 to 1743 of HCV type 5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
Ile Ala Gly Gln Phe Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr
1 5 10 15

Gly Gln Lys Ala
20

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: GB48-3-10

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

C S *f ^ GAC ATC AGG * TC GAG GAG GTC TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAA GCC CGC AAG GCA ATT ACC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAG GTG GGC GGT CCC ATG CAT AAC AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTG TGC GGG TAT CGC AGA TGT CGC GCA AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

p!f f CTG ACG TGC ^ AC CTC AAA GCC TCA GCC OCT ATC AAA 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Lys
65 70 75

111 i?° T CTG AGA TGC ACC ATG TTG GTC TGT GGT GAT GAC CTG 286
Ala Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

Sal ?a? S a° £° c GC ° AT GGC GTA GAG GAG GAC *** GGA CCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Pro Leu
100 105 110

GGA GCC 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Lys Ala
65 70 75 80

Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Pro Leu Gly
100 105 110

Ala

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: GB116-3-5

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

C TCC ACT GTA ACC GAA AAG GAC ATC AGG GTC GAG GAG GAG GTA TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAG GCC CGC AGA GCA ATT ACC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Arg Ala Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAC GTG GGC GGT CCC ATG CAT AAC AGC AGG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp
35 40 45

CTG TGC GGG TAT CGC AGA TGC CGT GCG AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAT CTC AAA GCC TCA GCC GCT ATC AGA 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCG GCG GGG CTG AGA GAC TGC ACC ATG TTG GTC TGT GGT GAT GAC CTG 286
Ala Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATT GCT GAA AGC GAT GGC GTA GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA GCC 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Arg Ala Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: GB215-3-8

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

C TCC ACT GTA ACC GAA AAA GAC ATC AGG GTC GAG GAG GAG GTA TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAA GCC CGC AAG GTA ATT ACC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAT GTG GGC GGT CCC ATG CAT AAT AGC AAA GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTG TGC GGG TAT CGC AGA TGC CGC GCA AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAT CTC AAA GCC TCA GCC GCC ATC AGG 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCG TCA GGG CTG AGA GAC TGC ACT ATG CTG GTC TAT GGT GAC GAC CTG 286
Ala Ser Gly Leu Arg Asp Cys Thr Met Leu Val Tyr Gly Asp Asp Leu
80 85 90 95

GTC GTC ATT GCC GAG AGC GAT GGC GTA GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA GTC 340
Gly Val



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ser Gly Leu Arg Asp Cys Thr Met Leu Val Tyr Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110

Val



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: GB358-3-3

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
C TCC ACT GTA ACC GAA AAG GAC ATC AGG GTC GAG GAG GAG GTG TAT 46
Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAC CTG GAG CCC GAG GCC CGC AAG GCA ATT ACT GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu
20 25 30

ACA GAG AGA CTC TAT GTG GGfc GGT CCC ATG CAT AAC AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTG TGt'gGG.TAT CGC AGA TGC CGC GCA AGC GGC GTC TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA CTG ACG TGC TAC CTC AAA GCC TCA GCC GCT ATC AGA 238
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCG GCG GGG CTG AGA GAC TGC ACC ATG TTG GTC TGT GGT GAT GAC CTG 286
Ala Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATC GCT GAG AGC GAT GGC GTT GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA *GC t C 340
Gly Ala



(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Thr Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: GB549-3-6

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

C TCC ACG GTG ACC GAA AGG GAT ATC AGG ACC GAG GAA GAG ATC TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr
1 5 10 15

CAG TGC TGC GAC CTG GAG CCC GAA GCC CGC AAG GTG ATA TCC GCC CTA 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu
20 25 30

ACG GAA AGA CTC TAC GTG GGC GGT CCC ATG TAC AAC TCC AAG GGG GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp
35 40 45

CTA TGC GGG CAA CGG AGG TGC CGC GCA AGC GGG GTC TAC ACC ACC AGC 190
Leu Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACT GTA ACG TGT TAT CTC AAG GCC GTT GCG GCT ACT AGG 238
Phe Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg
65 70 75

GCC GCA GGT CTG AAA GGT TGC AGC ATG CTG GTT TGT GGA GAC GAC TTA 286
Ala Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTC ATC TGC GAG AGC GGC GGC GTA GAG GAG GAT GCA AGA GCC CTC 334
Val Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu
100 105 110

CGA GCC 340
Arg Ala

• ". ' 

i 

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg Ala
65 70 75 80

Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu Arg
100 105 110

Ala



(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: GB809-3-1

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

C TCC ACT GTG ACT GAG AGA GAC ATC AAG GTC GAA GAA GAA GTC TAT 46
Ser Thr Val Thr Glu Arg Asp Ile Lys Val Glu Glu Glu Val Tyr
1 5 10 15

CAG TGT TGT GAT CTG GAG CCC GAG GCC CGC AAG GTA ATA GCC GCC CTC 94
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ala Ala Leu
20 25 30

ACG GAG AGA CTC TAC GTC GGC GGC CCC ATG CAT AAC AGC AAG GGA GAC 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp
35 40 45

CTT TGC GGG TAT CGT AGA TGC CGC GCG AGC GGC GTA TAC ACC ACC AGC 190
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser
50 55 60

TTC GGG AAC ACA ATG ACG TGC TAC CTT AAG GCC TCA GCA GCC ATC AGG 238
Phe Gly Asn Thr Met Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg
65 70 75

GCT GCG GGG CTA AAG GAT TGC ACC ATG CTG GTT TGC GGT GAC GAC CTA 286
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTC GTG ATC GCC GAG AGC GGT GGC GTT GAG GAG GAC AAA CGA GCC CTC 334
Val Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu
100 105 110

GGA GCT 340
Gly Ala

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ser Thr Val Thr Glu Arg Asp Ile Lys Val Glu Glu Glu Val Tyr Gln
1 5 10 15

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ala Ala Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe
50 55 60

Gly Asn Thr Met Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala
65 70 75 80

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu Gly
100 105 110

Ala

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB358-4-1 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..574 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

ACT TGC GGC TTT GCG GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC CTG GCA CAC GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATC AAT TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC TTG GCA CTT CTT TCG TGC CTG ACT GTT CCC ACC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC GTC AAC TAT CGC AAT GCC TCG GGC ATC TAT CAC ATC ACC AAT GAC 240
Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCG AAC TCG AGC ATA GTG TAC GAG ACC GAG CAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His
85 90 95

CTC CCA GGG TGT TTA CCC TGC GTG AGG GTT GGG AAT CAG TCA CGC TGC 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC ACT CCC ACC GTG GCG GCG CCT TAC ATC GGC GCT CCG 384
Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

CTT GAA TCC CTC CGG AGT CAT GTG GAT CTG ATG GTA GGT GCC GCT ACT 432
Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GCG TGC TCC GCT CTT TAC ATC GGA GAC CTG TGC GGT GGC GTA TTC TTG 480
Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTC TCT TTC CAG CCG CGG CGC CAC TGG ACT ACG CAG 528
Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC GTT ACG GGC CAC AGG A 574
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg
180 185 190

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His
85 90 95

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg
180 185 190

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: GB549-4-3

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..574

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

ACG TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC TTG GCA CAT GGT GTC AGG GCC GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCA ACA GGG AAT CTT CCC GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTA GCA CTT CTC TCG TGC TTG ACT GTC CCG GCC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCG CAG CAC TAC CGG AAC ATC TCG GGC ATT TAT CAC GTC ACC AAT GAC 240
Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCT AGT ATA GTG TAT GAA GCT GAC CAT CAT ATC ATG CAT 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

CTA CCA GGG TGT GTG CCT TGC GTG AGA ACC GGG AAC ACC TCG CGC TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

TGG GTT CCT TTA ACA CCC ACT GTG GCT GCC CCC TAT GTT GGC GCG CCG 384
Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

CTC GAA TCC ATG CGG CGG CAC GTG GAC TTA ATG GTG GGT GCC GCC ACC 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTC TGd TCG GCC CTG TAC ATC GGA GAC CTT TGC GGA GGT GTC TTC CTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTC GGG CAG ATG TTC ACC TTC CGG CCG CGC CGC CAT TGG ACT ACC CAG 528
Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAC TGC TCT ATC TAT GAT GGC CAC ATC ACC GGC CAT AGA A 574
Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg
180 185 190

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg
180 185 190



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 574 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: GB809-4-3 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..574 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 
ACG TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GT1\ GGG GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC CTG GCA CTT CTT TCG TGp CTC ACT GTC CCA GCG TCA 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GAG CAC TAC dGG AAT GCT TCG GGC ATC TAT CAC ATC ACC AAT GAC 240
Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGT CCG AAT TCC AGC GTA GTC TAT GA1A ACT GAC CAC CAT ATA TTG CAT 288
Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His
85 90 95

TTG CCG GGG TGC GTA CCC TGC GTG AGG GCC GGG AAC GTG TCT CG* TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys
100 105 110

TGG ACG CCG GTA ACA CCT ACG GTG GCT GCC GTA TCC ATG GAC GCT CCG 384
Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro
115 120 125

CTC GAG TCC TTC CGG CGG CAT GTG GAC CTA ATG GTA GGT GC?G GCC ACC 432
Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTG TGT TCT GTC CTC TAT GTT GGA GAC CTC TGT GGA GGT GCT TTC CTA 480
Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu
145 150 155 160

GTG GGG CAG ATG TTC ACC TTC CAG CCG CGT CGC CAC TGG ACC ACG CAG 528
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAT TGT AAT TGC TCC ATC TAT ACT GGC CAT ATC ACC GGC CAC AGG A 574
Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg
180 185 190



(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys
100 105 110

Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro
115 120 125

Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg
180 185 190

(2) INFORMATION FOR SEQ ID NO: 124: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr206"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
TGGGGATCCC GTATGATACC CGCTGCTTTG A 31

(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..30

(D) OTHER INFORMATION: /standard_name= "HCV Primer HCPr207"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
GGCGGAATTC CTGGTCATAG CCTCCGTGAA 30

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid
(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: amino acid 

(C) INDIVIDUAL ISOLATE: GB358 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Val Tyr Glu Thr Glu His His Ile Leu His Leu
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

Val Tyr Glu Ala Asp His His Ile Met His Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Val Tyr Glu Thr Asp His His Ile Leu His Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: amino acid 

(C) INDIVIDUAL ISOLATE: GB358 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Val Arg Val Gly Asn Gln Ser Arg Cys Trp Val Ala Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid
(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

Val Arg Thr Gly Asn Thr Ser Arg Cys Trp Val Pro Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: amino acid 

(C) INDIVIDUAL ISOLATE: GB809 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

Val Arg Ala Gly Asn Val Ser Arg Cys Trp Thr Pro Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
Ala Pro Tyr Ile Gly Ala Pro Leu Glu Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: amino acid 

(C) INDIVIDUAL ISOLATE: GB549 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Ala Pro Tyr Val Gly Ala Pro Leu Glu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: amino acid 

(C) INDIVIDUAL ISOLATE: GB809 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Ala Val Ser Met Asp Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB358 and GB809

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Gln Pro Arg Arg His Trp Thr Thr Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid
(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Arg Pro Arg Arg His Trp Thr Thr Gln Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: amino acid

(C) INDIVIDUAL ISOLATE: GB549

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
Arg Pro Arg Arg His Trp Thr Thr Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
TGGGATATGA TGATGAACTG GTC 23

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
CCAGGTACAA CCGAACCAAT TGCC 24

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 957 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..957

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..954

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACT AAC 48
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGC CAG ATC GTT GGT 96
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTA TAC TTG TTG CCG CGC kGG GGC CCC CGG TTG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

ACG AGG AAA ACT TCC GAG CGG TCC CAG CCA CGT GGG AGG CGC CAG CCC 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAA GAT CGG CGC CCC ACT GGC AAG TCC TGG GGA AAA CCA GGA 240
Ile Pro Lys Asp Arg Arg Pro Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

TAC CCT TGG CCC CTG TAC GGG AAT GAG GGC CTC GGC TGG GCA GGG TGG 288
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

CTC CTG TCC CCC CGA GGG TCT CGC CCG TCA TGG GGC CCA ACT GAC CCC 336
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

CGG CAC AGG TCA CGC AAC TTG GGT AAG GTC ATC GAT ACC CTT ACG TGT 384
Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTT GCC GAC CTC ATG GGG TAC ATC CCT GTC GTC GGC GCC CCA GTT 432
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Val
130 135 140

GGT GGT GTC GCC AGA GCT CTC GCG CAT GGC GTG AGA GTT CTG GAA GAC 480
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

GGG ATA AAC TAT GCA ACA GGG AAC TTG CCC GGT TGC TCC TTT TCT ATC 528
Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

TTC TTA TTG GCC CTG CTA TCT TGT ATC ACT GTG CCG GTC TCC GGC TTG 576
Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Val Pro Val Ser Gly Leu
180 185 190

CAG GTC AAG AAC ACC AGC AGC TCT TAC ATG GTA ACC AAT GAC TGC CAG 624
Gln Val Lys Asn Thr Ser Ser Ser Tyr Met Val Thr Asn Asp Cys Gln
195 200 205

AAC AGT AGC ATC GTC TGG CAG CTC AGG GAT GCT GTT CTT CAC GTC CC<i 672
Asn Ser Ser Ile Val Trp Gln Leu Arg Asp Ala Val Leu His Val Pro
210 215 220

GCjiG TGT GTC CCT TGT GAG GAG AAG GGC AAC ATA TCC CGC TGT TGG ATA 720
Gly Cys Val Pro Cys Glu Glu Lys Gly Asn Ile Ser Arg Cys Trp Ile
225 230 235 240

CCG GTT TCG CCC AAT ATA GCT GTG AGC CAA CCT GGT GCG CTT ACC AAG 768
Pro Val Ser Pro Asn Ile Ala Val Ser Gln Pro Gly Ala Leu Thr Lys
245 250 255

GGC CTG CGG ACG CAT ATT GAT ACC ATC ATT GCA TCC GCT ACG TTT TGC 816
Gly Leu Arg Thr His Ile Asp Thr Ile Ile Ala Ser Ala Thr Phe Cys
260 265 270

TCT GCC CTG TAC ATA GGA GAC CTG TGT GGC GCG GTG ATG TTG GCT TCT 864
Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Ala Val Met Leu Ala Ser
275 280 285

CAA GTC TTC ATC ATC TCG CCC CAG CAT CAT AAG TTT GTC CAG GAC TGC 912
Gln Val Phe Ile Ile Ser Pro Gln His His Lys Phe Val Gln Asp Cys
290 295 300

AAC TGT TCC ATA TAC CCA GGC CAC ATC ACT GGA CAT CGG ATG GCG 957
Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala
305 310 315


(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 319 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Pro Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Val Pro Val Ser Gly Leu
180 185 190

Gln Val Lys Asn Thr Ser Ser Ser Tyr Met Val Thr Asn Asp Cys Gln
195 200 205

Asn Ser Ser Ile Val Trp Gln Leu Arg Asp Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Glu Lys Gly Asn Ile Ser Arg Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Ile Ala Val Ser Gln Pro Gly Ala Leu Thr Lys
245 250 255

Gly Leu Arg Thr His Ile Asp Thr Ile Ile Ala Ser Ala Thr Phe Cys
260 265 270

Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Ala Val Met Leu Ala Ser
275 280 285

Gln Val Phe Ile Ile Ser Pro Gln His His Lys Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala
305 310 315

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 340 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..337

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..340

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

C TCA ACG GTC ACG GAG AGG GAC ATC AGA ACT GAG GAG TCC ATA TAC 46
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr
1 5 10 15

CTT GCT TGC TCT TTA CCC GAG CAG GCA CGG ACT GCC ATA CAC TCA CTG 94
Leu Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu
20 25 30

ACT GAG AG<3 CTT TAC GTG GGA GGG CCC ATG CTA AAC AGC AAA GGG CAA 142
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Leu Asn Ser Lys Gly Gln
35 40 45

ACC TGC GGA TAC AGA CGC TGC CGC GCC AGC GGA GTG TTC ACC ACT AGC 190
Thr Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser
50 55 60

ATG GGA AAT ACC ATC ACG TGC TAC GTG AAG GCA CAA GCA GCC TGT AAG 238
Met Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Gln Ala Ala Cys Lys
65 70 75

GCT GCG GGC ATA ATT GCC CCC ACG ATG CTG GTG TGC GGC GAC GAT CTA 286
Ala Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu
80 85 90 95

GTT GTC ATC TCA GAG AGT CAG GGG ACC GAG GAG GAC GAG CGG AAC CTA 334
Val Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu
100 105 110

CGA GCC 340
Arg Ala



(2) INFORMATION FOR SEQ ID NO: 146: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Leu
1 5 10 15

Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu Thr
20 25 30

Glu Arg Leu Tyr Val Gly Gly Pro Met Leu Asn Ser Lys Gly Gln Thr
35 40 45

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met
50 55 60

Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Gln Ala Ala Cys Lys Ala
65 70 75 80

Ala Gly Ile Ile Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Val
85 90 95

Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu Arg
100 105 110

Ala

(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 345 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..345

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..342



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
ATG AGC ACA CTT CCT AAA CCA CAA AGA AAA ACC AAA AGA AAC ACC AAC 48
Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CCC GGC CAC AGG ACG TTA AGT TCC CAG GCG GCG GTC AGA TCG TTG GTG 96
Pro Gly His Arg Thr Leu Ser Ser Gln Ala Ala Val Arg Ser Leu Val
20 25 30

GAG TTT ACG TGC TAC CAC GCA GGG GCC CCC AGT TGG GTG TGC GTG CAG 144
Glu Phe Thr Cys Tyr His Ala Gly Ala Pro Ser Trp Val Cys Val Gln
35 40 45

TGC GCA AGA CTT CCG AGC GGT CGC AAC CTC GCA GTA GGC GCC AAC CCA 192
Cys Ala Arg Leu Pro Ser Gly Arg Asn Leu Ala Val Gly Ala Asn Pro
50 55 60

TCC CCA GGG CGC GCC GAA CCG AGG GCA GGT CCT GGG CTC AGC CCG GGT 240
Ser Pro Gly Arg Ala Glu Pro Arg Ala Gly Pro Gly Leu Ser Pro Gly
65 70 75 80

ACC CTT GGC CCC TAT ATG GGA ATG AGG GCT GCG GGT GGG CAG GGT GGC 288
Thr Leu Gly Pro Tyr Met Gly Met Arg Ala Ala Gly Gly Gln Gly Gly
85 90 95

TCC TGT CCC CGC GCG GCT CTC GCC CGT CGT GGG GCC CAA ATG ACC CCC 336
Ser Cys Pro Arg Ala Ala Leu Ala Arg Arg Gly Ala Gln Met Thr Pro
100 105 110

GGC GCA GGA 345
Gly Ala Gly
115

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Pro Gly His Arg Thr Leu Ser Ser Gln Ala Ala Val Arg Ser Leu Val
20 25 30

Glu Phe Thr Cys Tyr His Ala Gly Ala Pro Ser Trp Val Cys Val Gln
35 40 45

Cys Ala Arg Leu Pro Ser Gly Arg Asn Leu Ala Val Gly Ala Asn Pro
50 55 60

Ser Pro Gly Arg Ala Glu Pro Arg Ala Gly Pro Gly Leu Ser Pro Gly
65 70 75 80

Thr Leu Gly Pro Tyr Met Gly Met Arg Ala Ala Gly Gly Gln Gly Gly
85 90 95

Ser Cys Pro Arg Ala Ala Leu Ala Arg Arg Gly Ala Gln Met Thr Pro
100 105 110

Gly Ala Gly
115


(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 280 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..280

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..277

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

G GCC TGT GAC CTC AAG GAC GAG GCT AGG AGG GTG ATA ACT TCA CTC 46
Ala Cys Asp Leu Lys Asp Glu Ala Arg Arg Val Ile Thr Ser Leu
1 5 10 15

ACG GAG CGG CTT TAC TGT GGT GGT CCT ATG TTC AAC AGC AAG GGA CAA 94
Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Gln
20 25 30

CAC TGC GGT TAC CGC CGC TGC CGT GCT AGT GGG GTG CTA CCC ACC AGC 142
His Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser
35 40 45

TTC GGG AAC ACA ATC ACC TGT TAC ATC AAA GCA AAG GCA GCT ACC AAA 190
Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Lys Ala Ala Thr Lys
50 55 60

GCT GCC GGA ATT AAA AAT CCA TCA TTC CTT GTC TGC GGA GAT GAC TTG 238
Ala Ala Gly Ile Lys Asn Pro Ser Phe Leu Val Cys Gly Asp Asp Leu
65 70 75

GTC GTG ATT GCT GAG AGT GCA GGG ATC GAT GAG GAC AGA GCG 280
Val Val Ile Ala Glu Ser Ala Gly Ile Asp Glu Asp Arg Ala
80 85 90

(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 93 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Ala Cys Asp Leu Lys Asp Glu Ala Arg Arg Val Ile Thr Ser Leu Thr 
1 5 10 15 

Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Gln His 
20 25 30 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Phe 
35 40 45 

Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Lys Ala Ala Thr Lys Ala 
50 55 60 

Ala Gly Ile Lys Asn Pro Ser Phe Leu Val Cys Gly Asp Asp Leu Val 
65 70 75 80 

Val Ile Ala Glu Ser Ala Gly Ile Asp Glu Asp Arg Ala 
85 90 

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..499 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..496 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA AAC ACC AAC 48 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC GTT GGC 96 
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG ATG GGT GTG CGC GCG 144 
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala 
35 40 45 

ACT CGG AAG ACT TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG CCT 192 
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

ATT CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA CCC GGG 240 
Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly 
65 70 75 80 

TAC CCT TGG CCC CTT TAC GCp AAT GAG GGC CTC GGG TGG GCA GGG TGG 288 
Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp 
85 90 95 

CTG CTC TCC CCT CGA GGC TCT CGG CCT AAT TGG GGC CCC AAT GAC CCC 336 
Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro 
100 105 110 

CGG CGA AAA TCG CGT AAT TTfc GGT AAG GTC ATC GAT ACC CTA ACG TGC 384 
Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

GGA TTC GCC GAT CTC ATG GGG TAT ATC CCG CTC GTA GGC GGC CCC ATT 432 
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile 
130 135 140 

GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT GAG GAC 480 
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

GGG GTA AAC TAT GCA ACA G 499 
Gly Val Asn Tyr Ala Thr 
165 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Met Gly Val Arg Ala 
35 40 45 






SUBSTITUTE SHEET (RULE 26) 



WO 94/25601 PCT/EP94/01323 


Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

Ile Pro Lys Ala Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly 
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile 
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

Gly Val Asn Tyr Ala Thr 
165 

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

ACG TGC GGA TTC GCC GAT CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly 
1 5 10 15 

CCC GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG GTC CTT 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 
20 25 30 

GAG GAC GGG GTA AAC TAT CCA ACA GGG AAT TTA CCC GGT TGC TCT TTC 144 
Glu Asp Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTT ATT CTT GCT CTT CTC TCG TGT (pTG ACC GTT CCG GCC TCT 192 
Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

GCA GTT CCC TAC CGA AAT GCC TCT GGG ATT TAT CAT GTT ACC AAT GAT 240 
Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT AAC CTG ATC CTA CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His 
85 90 95 

GCA CCT GGT TGC GTG CCT TGT GTC ATG ACA GGT AAT GTG AGT AGA TGC 336 
Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys 
100 105 110 

TGG GTC CAA ATT ACC CCT ACA CTG TCA GCC CCG AGC CTC GGA GCA GTC 384 
Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val 
115 120 125 

ACG GCT CCT CTT CGG AGA GCC GTT GAC TAC CTA GCG GGA GGG GCT GCC 432 
Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala 
130 135 140 

CTC TGC TCC GCG TT^ TAC GTA GGA GAC GCG TGT GGG GCA CTA TTC TTG 480 
Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu 
145 150 155 160 

GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAC GCT ACG GTG CAG 528 
Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln 
165 170 175 

AAC TGC AAC TGT TCC ATT TAC AGT GGC CAT GTT ACC GGC CAC CGG ATG 576 
Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met 
180 185 190 

GCG 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 
20 25 30 

Glu Asp Gly Val Asn Tyr Pro Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His 
85 90 95 

Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys 
100 105 110 

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val 
115 120 125 

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala 
130 135 140 

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln 
165 170 175 

Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

ACG TGC GGA TTC GCC GAC CTC GTG GGG TAC ATC CCG CTC GTA GGC GGC 48 
Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly 
1 5 10 15 

CCC GTT GGG GGC GTC GCA AGG GCT CTC GCA CAT GGT GTG AGG GTT CTT 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 
20 25 30 

GAG GAC GGG GTG AAT TAT GCA ACA GGG AAT CTG CCT GGT TGC TCT TTC 144 
Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC ATT CTT GCA CTT CTC TCG TGC CTC ACT GTC CCG GCC TCT 192 
Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

GCA GTT CCC TAC CGA AAT GCC TCT GGG ATC TAT CAT GTC ACC AAT GAT 240 
Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

TGC CCA AAC TCJ TCC ATA GTC TAT GAG GCA GAT GAT CTG ATC CTA CAC 288 
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His 
85 90 95 

GCA CCT GGC TGC GTG CCT TGT GTC AGG AAA GAT AAT GTG AGT AGG TGC 336 
Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys 
100 105 110 

TGG GTC CAA ATT ACC CCC ACG CTG TCA GCC CCG AGC TTC GGA GCA GTC 384 
Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val 
115 120 125 

ACG GCT CCC CTT CGG AGA GCC GTT GAT TAC TTG GTG GGA GGG GCT GCC 432 
Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala 
130 135 140 

CTC TGC TCC GCG TTA TAC GTT GGA GAC GCG TGT GGG GCA CTA TTT TTG 480 
Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu 
145 150 155 160 

GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAT GCT ACG GTG CAG 528 
Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln 
165 170 175 

GAC TGC AAC TGT TCC ATC TAC AGT GGC CAC GTC ACC GGC CAT CAG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met 
180 185 190 

GCA 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 
20 25 30 

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His 
85 90 95 

Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys 
100 105 110 

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val 
115 120 125 

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala 
130 135 140 

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 530 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..530 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 3..527 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

CA CCT ACG ACA GCT CTG CTG GTG GCC CAG TTA CTG CGG ATT CCC CAA 47 
Pro Thr Thr Ala Leu Leu Val Ala Gln Leu Leu Arg Ile Pro Gln 
1 5 10 15 

GTG GTC ATT GAC ATC ATC GCA GGG AGC CAC TGG GGG GTC TTG TTT GCC 95 
Val Val Ile Asp Ile Ile Ala Gly Ser His Trp Gly Val Leu Phe Ala 
20 25 30 

GCC GCA TAC TAT GCA TCG GTG GCT AAC TGG ACC AAG GTC GTG CTG GTC 143 
Ala Ala Tyr Tyr Ala Ser Val Ala Asn Trp Thr Lys Val Val Leu Val 
35 40 45 

TTG TTT CTG TTT GCA GGG GTT GAT GCT ACT ACC CAb ATT TCG GGC GGC 191 
Leu Phe Leu Phe Ala Gly Val Asp Ala Thr Thr Gln Ile Ser Gly Gly 
50 55 60 

TCC AGC GCC CAA ACG ACG TAT GGC ATC GCC TCA TTT ATC ACC CGC GGC 239 
Ser Ser Ala Gln Thr Thr Tyr Gly Ile Ala Ser Phe Ile Thr Arg Gly 
65 70 75 

GCG CAG CAG AAA CTG CAG CTC ATA AAT ACC AAC GGA AGC TGG CAC ATC 287 
Ala Gln Gln Lys Leu Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile 
80 85 90 95 

AAC AGG ACC GCC CTT AAT TGT AAT GAC AGC CTC CAG ACT GGG TTC ATA 335 
Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Ile 
100 105 110 

GCC GGC CTC TTC TAC TAC CAT AAG TTC AAC TCT TCT GGA TGC CCG GAT 383 
Ala Gly Leu Phe Tyr Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Asp 
115 120 125 

CGG ATG GCT AGC TGT AGG GCC CTT GCC ACT TTT GAC CAG GGC TGG GGA 431 
Arg Met Ala Ser Cys Arg Ala Leu Ala Thr Phe Asp Gln Gly Trp Gly 
130 135 140 

ACT ATC AGC TAT GCC AAC ATA TCG GGT CCC AGT GAT GAC AAA CCA TAT 479 
Thr Ile Ser Tyr Ala Asn Ile Ser Gly Pro Ser Asp Asp Lys Pro Tyr 
145 150 155 

TGC TGG CAC TAT CCC CCA CGG CCG TGC GGA GTG GTG CCA GCC CAA GAG 527 
Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Val Val Pro Ala Gln Glu 
160 165 170 175 

GTC 530 
Val 



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 176 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

Pro Thr Thr Ala Leu Leu Val Ala Gln Leu Leu Arg Ile Pro Gln Val 
1 5 10 15 

Val Ile Asp Ile Ile Ala Gly Ser His Trp Gly Val Leu Phe Ala Ala 
20 25 30 

Ala Tyr Tyr Ala Ser Val Ala Asn Trp Thr Lys Val Val Leu Val Leu 
35 40 45 

Phe Leu Phe Ala Gly Val Asp Ala Thr Thr Gln Ile Ser Gly Gly Ser 
50 55 60 

Ser Ala Gln Thr Thr Tyr Gly Ile Ala Ser Phe Ile Thr Arg Gly Ala 
65 70 75 80 

Gln Gln Lys Leu Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn 
85 90 95 

Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Ile Ala 
100 105 110 

Gly Leu Phe Tyr Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Asp Arg 
115 120 125 

Met Ala Ser Cys Arg Ala Leu Ala Thr Phe Asp Gln Gly Trp Gly Thr 
130 135 140 

Ile Ser Tyr Ala Asn Ile Ser Gly Pro Ser Asp Asp Lys Pro Tyr Cys 
145 150 155 160 

Trp His Tyr Pro Pro Arg Pro Cys Gly Val Val Pro Ala Gln Glu Val 
165 170 175 

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..340 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 2..337 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

C TCG ACC GTT ACC GAA CAT GAC ATA ATG ACC GAA GAG TCC ATT TAC 46 
Ser Thr Val Thr Glu His Asp Ile Met Thr Glu Glu Ser Ile Tyr 
1 5 10 15 

CAA TCA TGT GAC TTG CAG CCC GAG GCA CGC GCA GCA ATA CGG TCA CTC 94 
Gln Ser Cys Asp Leu Gln Pro Glu Ala Arg Ala Ala Ile Arg Ser Leu 
20 25 30 

ACC CAA CGC CTC TAC TGT GGA GGC CCC ATG TAC AAC AGC AAG GGG CAA 142 
Thr Gln Arg Leu Tyr Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln 
35 40 45 

CAG TGT GGT TAT CGC AGA TGC CGC GCC AGC GGC GTT TTC ACC ACC AGT 190 
Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 
50 55 60 

ATG GGC AAC ACC ATG ACG TGC TAC ATC AAG GCT TTA GCC TCC TGT AGA 238 
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg 
65 70 75 

GCC GCA AGG CTC CGG GAC TGC ACG CTC CTG GTG TGT GGT GAC GAT CTT 286 
Ala Ala Arg Leu Arg Asp Cys Thr Leu Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTG GCC ATC TGC GAG AGC CAG GGG ACA CAC GAG GAT GAA GCA AGC CTG 334 
Val Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Ser Leu 
100 105 110 

AGA GCC 340 
Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

Ser Thr Val Thr Glu His Asp Ile Met Thr Glu Glu Ser Ile Tyr Gln 
1 5 10 15 

Ser Cys Asp Leu Gln Pro Glu Ala Arg Ala Ala Ile Arg Ser Leu Thr 
20 25 30 

Gln Arg Leu Tyr Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln Gln 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met 
50 55 60 

Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg Ala 
65 70 75 80 

Ala Arg Leu Arg Asp Cys Thr Leu Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Ser Leu Arg 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..340 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 2..337 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

C TCA ACC GCC ACC GAA CAT GAC ATA TTG ACT GAA GAG TCC ATA TAC 46 
Ser Thr Ala Thr Glu His Asp Ile Leu Thr Glu Glu Ser Ile Tyr 
1 5 10 15 

CAA TCA TGT GAC TCG CAG CCC GAC GCA CGd GCA GCA ATA CGG TCA CTC 94 
Gln Ser Cys Asp Ser Gln Pro Asp Ala Arg Ala Ala Ile Arg Ser Leu 
20 25 30 

ACC CAA CGC TTG TTC TGT GGA GGC CCC ATG TAT AAC AGC AAG GGG CAA 142 
Thr Gln Arg Leu Phe Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln 
35 40 45 

CAA TGT GGT TAT CGC AGA TGC CGC GCC AGC GGC GTC TTC ACC ACC AGT 190 
Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 
50 55 60 

ATG GGC AAC ACC ATG ACG TGC TAC ATT AAG GCT TTA GCC TCC TGT AGA 238 
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg 
65 70 75 

ACC GCT GGG CTC CGG GAC TAC ACG CTq CTG GTG TGT GGT GAC GAT CAT 286 
Thr Ala Gly Leu Arg Asp Tyr Thr Leu Leu Val Cys Gly Asp Asp His 
80 85 90 95 

GTG GCC ATC TGp GAG AGC CAG GGG ACA CAC GAG GAT GAA GCG AAO CTG 334 
Val Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Asn Leu 
100 105 110 

AGA GCC 340 
Arg Ala 

(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

Ser Thr Ala Thr Glu His Asp Ile Leu Thr Glu Glu Ser Ile Tyr Gln 
1 5 10 15 

Ser Cys Asp Ser Gln Pro Asp Ala Arg Ala Ala Ile Arg Ser Leu Thr 
20 25 30 

Gln Arg Leu Phe Cys Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln Gln 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met 
50 55 60 

Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ser Cys Arg Thr 
65 70 75 80 

Ala Gly Leu Arg Asp Tyr Thr Leu Leu Val Cys Gly Asp Asp His Val 
85 90 95 

Ala Ile Cys Glu Ser Gln Gly Thr His Glu Asp Glu Ala Asn Leu Arg 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 499 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..499 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 
(B) LOCATION: 1..496 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

ATG AGC ACG AAT CCT AAA CTT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48 
Met Ser Thr Asn Pro Lys Leu Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

CGC CGC CCC ATG GAC GTT AAG TTC CCG GGT GGT GGC CAG ATC GTT GGC 96 
Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGG TTG GGT GTG CGC GCG 144 
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

ACT CGG AAG ACT TCG GAG CGG TCG CAA CCT CGT GGG AGG CGC CAA CCT 192 
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

ATC CCC AAG GCG CGC CGA TCC GAG GGC AGA TCC TGG GCG CAG CCC GGG 240 
Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly 
65 70 75 80 

TAT CCT TGG CCC CTT TAC GGC AAT GAG GGC TGT GGG TGG GCA GGG TGG 288 
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

CTC CTG TCC CCT CGC GGG TCT CGG CCG TCT TGG GGC CCT AAT GAT CCC 336 
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

CGG CGG AGG TCC CGC AAC CTG GGT AAG GTC ATC GAT ACC CTA ACA TGC 384 
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTA GGC GCC CCC GTG 432 
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val 
130 135 140 

GGT GGC GTC GCC AGA GCC CTG GCA CAC GGT GTT AGG GCT GTG GAA GAC 480 
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp 
145 150 155 160 

GGG ATC AAC TAC GCA ACA G 499 
Gly Ile Asn Tyr Ala Thr 
165 



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 166 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

Met Ser Thr Asn Pro Lys Leu Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly 
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val 
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp 
145 150 155 160 

Gly Ile Asn Tyr Ala Thr 
165 

(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 499 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCTATG 60 

GACGTTAAGT TCCCAGGCGG TGGTCAGATC GTTGGCGGAG TTTACTTGTT GCCGCGCAGG 120 

GGCCCCAGGT T($GGTGTGCG CGCGACTCGG AAGACTTCGG AGCGGTCGCA ACCTCGTGGG 180 

AGGCGCCAAC CTATCCCCAA GGCGCGCCGA ACCGAGGGCA GATCCTGGGC GCAGCCCGGG 240 

TATCCTTGGC CCCTTTACGG CAATGAGGGC TGTGGGTGGG CAGGGTGGCT CCTGTCCCCT 300 

CGCGGNTCTC GGNCGTCTTG GGGCCCCAAT GATCCCCGGN GGAGATCCCG CAACTTGGGT 360 

AAGGTCATCG ATACCCTAAC ATGCGGCTTC GCCGACCTCA TGGGATACAT CCCGCTTGTA 420 

GGCGCCCCCG TGGGTGGCGT CGCCAGGGCC CTGGCACATG GTGTTAGGGC TGTGGAAGAC 480 

GGGATCAATT ATGCAACAG 499 

(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

Ile Pro Lys Ala Arg Arg Thr Glu Gly Arg Ser Trp Ala Gln Pro Gly 
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Xaa Ser Arg Xaa Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg Xaa Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..579 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTA GGC GCC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCC GTG GGT GGC GTC GCC AGG GCC CTG GCA CAT GGT GTT AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

GAA GAC GGG ATC AAT TAT GCA ACA GGG AAC CTT CCC GGT TGC TCC TTT 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC CTG TTG GCG CTC CTC TCG TGC CTG ACT GTT CCC ACA TCG 192 
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

GCC GTT AAC TAT CGC AAT GCT TCG GGC ATT TAT CAC ATC ACC AAT GAC 240 
Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

TGC CCG AAT GCA AGC ATA GTG TAC GAG ACp GAA AAT CAC ATC TTA CAC 288 
Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Glu Asn His Ile Leu His 
85 90 95 

CTC CCA GGG TGC GTA CCC TGT GTG AGG ACT GGG AAC CAG TCG CGG TG+ 336 
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys 
100 105 110 

TGG GTG GCC CTC ACT CCb ACA GTA GCG TCG CCA TAC GCC GGT GCT CCG 384 
Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Ala Gly Ala Pro 
115 120 125 

CTT GAG CCC TTG CGG CGT CAT GTG GAC CTG ATG GTA GGT GCT GCC ACC 432 
Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

ATG TGT TCC GCC CTC TAC ATC GGC GAC TTG TGC GGT GGC TTA TTC TTG 480 
Met Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu 
145 150 155 160 

GTG GGC CAA ATG TTC ACC TTC CAA CCG CGA CGT CAC TGG ACC ACT CAG 528 
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAC TGC AAT TGT TCC ATC TAC ACG GGC CAC ATT ACG GGT CAT CGG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met 
180 185 190 

GCA 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser 
50 55 60 

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Glu Asn His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys 
100 105 110 

Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Ala Gly Ala Pro 
115 120 125 

Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Met Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTT GTA GGC GCC 48 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

CCC GTG GGT GGC GTC GCC AGA GCC CTG GCA CAC GGT GTT AGG GCT GTG 96 
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

GAA GAC GGG ATC AAC TAC GCA ACA GGG AAT CTC CCC GGT TGC TCC TTT 144 
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

TCT ATC TTC CTC TTG GCA CTT CTC TCG TGC CTC ACT GTT CCC GCG TCG 192 
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

GGC GTT AAC TAT CGC AAT GCT TCG GGC GTT TAT CAC ATC ACC AAC GAC 240 
Gly Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp 
65 70 75 80 

TGC CCG AAT GCG AGC ATA GTG TAC GAG ACC GAC AAT CAC ATC TTA CAC 288 
Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Asp Asn His Ile Leu His 
85 90 95 

CTC CCA GGG TGC GTA CCC TGJ GTG AAG ACC GGG AAC CAG TCG CGG TGT 336 
Leu Pro Gly Cys Val Pro Cys Val Lys Thr Gly Asn Gln Ser Arg Cys 
100 105 110 

TGG GTG GCC CTC ACT CCC ACA GTG GCG TCG CCT TAC GTC GGT GCT CCG 384 
Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Val Gly Ala Pro 
115 120 125 

CTC GAG CCC TTG CGG CGC CAT GTG GAC CTG ATG GTA GGT GCT GCC ACC 432 
Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

GTG TGC TCC GCC CTC TAC GTC GGC GAC CTG TGC GGT GGC TTA TTC TTG 480 
Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Leu Phe Leu 
145 150 155 160 

GTA GGC CAA ATG TTC ACC TTC CAA CCG CGA CGC CAC TGG ACG ACC CAG 528 
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

GAC TGT AAT TGT TCC ATC TAC GCA GGG CAT ATT ACG GGC CAT CGG ATG 576 
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Arg Met 
180 185 190 

GCT 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 
1 5 10 15 

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val 
20 25 30 

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 
35 40 45 

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
50 55 60 

Gly Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp 
65 70 75 80 

Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Asp Asn His Ile Leu His 
85 90 95 

Leu Pro Gly Cys Val Pro Cys Val Lys Thr Gly Asn Gln Ser Arg Cys 
100 105 110 

Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Val Gly Ala Pro 
115 120 125 

Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr 
130 135 140 

Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Leu Phe Leu 
145 150 155 160 

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln 
165 170 175 

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Arg Met 
180 185 190 

Ala 



(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579
(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

ACA TGC GGC TTC GCC GAC CTC ATG GGA TAC AT(j CCG CTT GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTT GGT GGC GTC GCC AGA GCC CTT GCG CAC GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAA GAC GGG ATT AAC TAT GCA ACA GGG AAC CTT CCT GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTG GCA CTT CTC TCG TGC CTG ACT GTC CCC GCC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GTG CAT TAT CAC AAC ACC TCG GGC ATC TAC CAC CTC ACC AAT GAC 240
Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Leu Thr Asn Asp
65 70 75 80

TGC CCT AAC TCT AGC ATA GTC TTT GAG GCA GTC CAT CAC ATC TTG CAC 288
Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Val His His Ile Leu His
85 90 95

CTT CCA GGA TGC GTC CCT TGT GTA AGA ACT GGG AAC CAG TCT CGG TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTA GCC TTp ACC CCC ACG CTG GCC GCG CCA TAC CTT GGC GCT CCA 384
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Pro Tyr Leu Gly Ala Pro
115 120 125

CTC GAG TCC ATG CGG CGT CAC GTG GAT TTG ATG GTG GGC ACT GCT ACA 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

TTG TGC TCA GCA CTC TAC GTT GGG GAC CTG TGC GGG GGC ATA TTC CTA 480
Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

GCG GGC CAG ATG TTC ACC TTC CGG CCC CGC CTC CAT TGG ACC ACC CAG 528
Ala Gly Gln Met Phe Thr Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175




GAG TGC AAT TGT TCC ACC TAT CCG GGC CAC ATC ACG GGT CAT AGA ATG 576
Glu Cys Asn Cys Ser Thr Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

GCG 579
Ala


(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Leu Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Val His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Pro Tyr Leu Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

Ala Gly Gln Met Phe Thr Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

Glu Cys Asn Cys Ser Thr Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 






(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

ACG TG6 GGT TCC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Ser Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC TTG GCG CAT GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATA AAC TAT GCA ACA GGG AAC CTT CCT GGT TGC TCT TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTG GCA CTT CTC TCG TGC CTG ACT GTC CCC GCC TCA 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GTG CAT TAT CAC AAC ACC TCG GGC ATC TAT CAC ATC ACT AAT GAC 240
Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCT AAC TCT AGC ATA GTC TTT GAG GCA GAG CAT CAC ATC TTG CAT 288
Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Glu His His Ile Leu His
85 90 95

CTT CCA GGA TGC GTC CCC TGT GTG AGA ACT GGG AAC CAG TCA CGA TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

TGG ATA GCC TTG ACC CCT ACG TTG GCC GCG CCA CAC ATT GGC GCT CCA 384
Trp Ile Ala Leu Thr Pro Thr Leu Ala Ala Pro His Ile Gly Ala Pro
115 120 125

CTT GAG TCC ATG CGA CGT CAT GTG GAT TTG ATG GTA GGC ACT GCC ACA 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

TTG TGC TCC GCA CTC TAC ATT GGA GAT CTG TGC GGA GGC ATA TTT CTA 480
Leu Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

GTG GGC CAG ATG TTC AAC TTC AGG CCC CGC CTG CAC TGG ACC ACC CAG 528
Val Gly Gln Met Phe Asn Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

GAG TGC AAT TGT TCC ATC TAT CCA GGC CAC ATC ACG GGT CAC AGA ATG 576
Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

GCG 579
Ala



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

Thr Cys Gly Ser Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val His Tyr His Asn Thr Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Phe Glu Ala Glu His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Ile Ala Leu Thr Pro Thr Leu Ala Ala Pro His Ile Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Thr Ala Thr
130 135 140

Leu Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Ile Phe Leu
145 150 155 160

Val Gly Gln Met Phe Asn Phe Arg Pro Arg Leu His Trp Thr Thr Gln
165 170 175

Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met
180 185 190

Ala

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

ACG TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC TTG GCA CAT GGT GTC AGG GCC GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCA ACA GGG AAT CTT CCC GGT TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTT CTA GCA CTT CTC TCG TGC TTG ACT GTC CCG GCC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCG CAG CAC TAC CGG AAC ATC TCG GGC ATT TAT CAC GTC ACC AAT GAC 240
Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCT AGT ATA GTG TAT GAA GCT GAG CAT CAT ATC ATG CAT 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

CTA CCA GGG TGT GTG CCT TGC GTG AGA ACC GGG AAC ACC TCG CGC TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

TGG GTT CCT TTA ACA CCC ACT GTG GCT GCC CCC TAT GTT GGC GCG CCG 384
Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

CTC GAA TCC ATG CGG CGG CAC GTG GAC TTA ATG GTG GGT GCC GCC ACC 432
Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTC TGC TCG GCC CTG TAC ATC GGA GAC CTT TGC GGA GGT GTC TTC CTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTC GGG CAG ATG TTC ACC TTC CGG CCG CGC CGC CAT TGG ACT ACC CAG 528
Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAC TGC TCT ATC TAT GAT GGC CAC ATC ACC GGC CAT AGA ATG 576
Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg Met
180 185 190

GCT 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Gln His Tyr Arg Asn Ile Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Thr Ser Arg Cys
100 105 110

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Asp Gly His Ile Thr Gly His Arg Met
180 185 190

Ala



(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

ACG TGC GGG TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCT 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCA GTA GGA GGC GTC GCC AGA GCC TTG GCG CAT GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATC AAT TAC GCA ACA GGG AAC CTT CCC GGC TGC TCC TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC TTG GTA CTT CTC TCG CGC CTA ACT GTC CCA GCG TCT 192
Ser Ile Phe Leu Leu Val Leu Leu Ser Arg Leu Thr Val Pro Ala Ser
50 55 60

GCT CAG CAC TAC CGG AAT GCA TCG GGC ATC TAC CAT GTC ACC AAC GAC 240
Ala Gln His Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCC AGT ATT GTG TAT GAA GCC GAC CAT CAC ATC ATG CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

CTA CCC GGG TGT GTG CCC TGT GTA AGA ACT GGG AAT GTC TCG CGT TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Val Ser Arg Cys
100 105 110

TGG ATT CCT TTfl ACA CCC ACT GTA GCC GTC CCC TAC CTC GGG GCT CCA 384
Trp Ile Pro Leu Thr Pro Thr Val Ala Val Pro Tyr Leu Gly Ala Pro
115 120 125

CTT ACG TCT GTA CGG CAG CAT GTG GAC CTG ATG GTG GGG GCG GCC ACC 432
Leu Thr Ser Val Arg Gln His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

TTA TGC TCT GCC CTC TAC ATC GGA GAC CAT TGC GGA GGT GTC TTC TTG 480
Leu Cys Ser Ala Leu Tyr Ile Gly Asp His Cys Gly Gly Val Phe Leu
145 150 155 160

GCA GGG CAG ATG GTC AGT TTC CAA CCC CGG CGT CAT TGG ACT ACC CAG 528
Ala Gly Gln Met Val Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAT TGC AAC TGT TCC ATC TAT GTG GGC CAC ATC ACC GGC CAC AGG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Val Gly His Ile Thr Gly His Arg Met
180 185 190

GCC 579 
Ala 



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Val Leu Leu Ser Arg Leu Thr Val Pro Ala Ser
50 55 60

Ala Gln His Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Ile Pro Leu Thr Pro Thr Val Ala Val Pro Tyr Leu Gly Ala Pro
115 120 125

Leu Thr Ser Val Arg Gln His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Leu Cys Ser Ala Leu Tyr Ile Gly Asp His Cys Gly Gly Val Phe Leu
145 150 155 160

Ala Gly Gln Met Val Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Val Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

ACCTGCGGCT TCGCCGACCT CATGGGATAC ATCCCGCTCG TAGGCGCCCC CGTGGGAGGC 60

GTCGCCAGAR CTCTGGCGCA TGGCGTCAGG GCTCTGGAAG ACGGGATCAA TTATGCAACA 120

GGGAATCTTC CTGGTTGCTC TTTCTCTATC TCCCTTCTTG AACTTCTCTC GTGCCTGACT 180

GTTCCCGCCT CAGCCATCCA CTATCGCAAT GCTTCGGACG GTTATTATAT CACCAATGAT 240

TGCCCGAACT CTAGCATAGT GTATGAAGCC GAGAACCACA TCTTGCACCT TCCGGGGTGT 300

ATACCCTGTG TGAAGACCGG GAATCAGTCG CGGTGCTGGG TGGCTCTCAC CCCCACGCTG 360

GCGGCCCCAC ACCTACGTGC TCCGCTTTCG TCCTTACGGG CGCATGTGGA CdTAATGGTG 420

GGGGCCGCCA CGGCATGCTC CGCTTTTTAC ATTGGAGATC TGTGCGGGGG TGTGTTTTTG 480

GCGGGCCAAC TGTTCACTAT CCGGCCACGC ATTCATGAAA CCACTCAGGA CTGCAATTGC 540

TCCATCTACT CAGGGCACAT CACGGGTNNN NNNNNNNNN 579



(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Xaa Leu Ala His Gly Val Arg Ala Leu
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Ser Leu Leu Glu Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Ile His Tyr Arg Asn Ala Ser Asp Gly Tyr Tyr Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Glu Asn His Ile Leu His
85 90 95

Leu Pro Gly Cys Ile Pro Cys Val Lys Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Pro His Leu Arg Ala Pro
115 120 125

Leu Ser Ser Leu Arg Ala His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Ala Cys Ser Ala Phe Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Ala Gly Gln Leu Phe Thr Ile Arg Pro Arg Ile His Glu Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Ile Thr Gly Xaa Xaa Xaa
180 185 190

Xaa


(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..578



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 



GCGTGCGGCT TCGCCGATCT CATGGGATAC ATCCCGCTCG TAGGCGCCCC CGTGGGTGGC 60

GTCGCCAGAG CCCTGGCGCA CGGTGTTAGG GCTGTGGAGG ACGGGATTAA CTACGCAACA 120

GGGAATCTTC CTGGTTGCTC TTTCTCTATC TNCCTTCTGG CACTTCTCTC GTGCCTGACT 180

GTCCCGGCCT CGGCTCAGCA CTACCGGAAT GTCTCGGGCA TCTACCACGT CACCAATGAT 240

TGCCCGAATT CCAGCATAGT GTATGAAGCC GATCACCACA TCATGCACTT ACCAGGGTGC 300

ATACCCTGCG TGAGGACCGG GAACGTTTCG CGCTGCTGGG TATCTCTGAC ACCTACTGTG 360

GCTGCTCCCT ACCTCGGGGC TCCGCTTACG TCGCTACGGC GGCATGTGGA TTTGATGGTG 420

GGTGCAGCCA CCCTTTGCTC TGCCCTCTAC GTCGGAGACC TCTGTGGAGG TGTCTTCCTA 480

GTGGGACAGA TGTTCACCTT CCAGCCGCGC CGCCACTGGA CCACTCAGGA CTGCAACTGC 540

TCCATTTACG TCGGCCACAT CACAGGCCAC AGAATGGCT 579


(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Ala Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Xaa Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Gln His Tyr Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Met His
85 90 95

Leu Pro Gly Cys Ile Pro Cys Val Arg Thr Gly Asn Val Ser Arg Cys
100 105 110

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Pro Tyr Leu Gly Ala Pro
115 120 125

Leu Thr Ser Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Val Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

ACC TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC CTA GAA CAC GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

GAG GAC GGT ATT AAT TAT GCA ACA GGG AAT CTC CCC GGT TGC TCT TTT 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TCC CTC TTG GCA CTT CTT TCG TGC CTG ACT GTT CCC ACC TCA 192
Ser Ile Ser Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC GTC AAC TAT CGC AAC GCC TCG GGC GTC TAT CAT ATC ACC AAT GAC 240
Ala Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCG AAT TCG AGC ATA GTG TAC GAG GCT GAC TAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Tyr His Ile Leu His
85 90 95

CTC CCT GGG TGC TTA CCC TGC GTG AGG GTT GGG AAT CAG TCA CGC TGC 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTT ACT CCC ACC GTG GCG GCG CCT TAC GTT GGT GCT CCG 384
Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Val Gly Ala Pro
115 120 125

CTA GAA TCC CTC CGG AGT CAT GTG GAT CTG ATG GTA GGT GCT GCT ACT 432
Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTG TGC TCC GCT CTT TAC ATC GGG GAC CTG TGC GGT GGC GTA TTT TTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTT TCT TTC CAG CCG CGA CGC CAC TGG ACC ACG CAG 528
Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCT ATC TAC GCG GGG CAC GTT ACG GGC CAC AGG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met
180 185 190

GCA 579
Ala

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ala Ser Ile Val Tyr Glu Thr Glu Asn His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Val Ala Ser Pro Tyr Ala Gly Ala Pro
115 120 125

Leu Glu Pro Leu Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Met Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met
180 185 190



Ala 



(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 

Ala Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Ser Phe Trp His Phe Ser Arg Ala * Leu Ser Arg Pro Arg
50 55 60

Leu Ser Thr Thr Gly Met Ser Arg Ala Ser Thr Thr Ser Pro Met Ile
65 70 75 80

Ala Arg Ile Pro Ala * Cys Met Lys Pro Ile Thr Thr Ser Cys Thr
85 90 95

Tyr Gln Gly Ala Tyr Pro Ala * Gly Pro Gly Thr Phe Arg Ala Ala
100 105 110

Gly Tyr Leu * His Leu Leu Trp Leu Leu Pro Thr Ser Gly Leu Arg
115 120 125

Leu Arg Arg Tyr Gly Gly Met Trp Ile * Trp Trp Val Gln Pro Pro
130 135 140

Phe Ala Leu Pro Ser Thr Ser Glu Thr Ser Val Glu Val Ser Ser *
145 150 155 160

Trp Asp Arg Cys Ser Pro Ser Ser Arg Ala Ala Thr Gly Pro Leu Arg
165 170 175

Thr Ala Thr Ala Pro Phe Thr Ser Ala Thr Ser Gln Ala Thr Glu Trp
180 185 190



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCb CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTG GGT GGC GTC GCC AGA GCC CTG GAA CAT GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

GAG GAC GGC ATC AAT TAT GCA ACA GGG AAT CTC CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TAC CTC TTG GCA CTT CTC TCG TGC CTG ACT GTT CCC ACC TCG 192
Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC ATC CAC TAT CGC AAT GCC TCG GGC GTC TAC CAC GTC ACC AAT GAC 240
Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCG AGC ATA GTG TAC GAG GCC GAC CAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His
85 90 95

CTT CCA GGG TGC TTA CCC TGT GTG AGG GTT GGG AAT CAG TCA CGT TGT 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC TCT CCC ACC GTG GCG GCG CCT TAC ATC GGT GCT CCA 384
Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

GTT GAA TCC TTC CGG AGA CAC GTG GAC ATG ATG GTG GGC GCT GCT ACT 432
Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr
130 135 140

GTG TGC TCC GCT CTC TAT AT? GGG GAC TTG TGT GGT GGC GTA TTC TTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTT TCT TTC CGG CCA CG^ CGC CAC TGG ACT ACG CAG 528
Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC ATC ACT GGC CAC GGA ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His
85 90 95

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met
180 185 190

Ala

(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCT GTG GGT GGC GTC GCC AGG GCC CTG GCA CAC GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATC AAT TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC TTG GCA CTT CTT TCG TGC CTG ACT GTT CCC ACC TCG 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC GTC AAC TAT CGC AAT GCC TCG GGC ATC TAT CAC ATC ACC AAT GAC 240
Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGC CCG AAC TCG AGC ATA GTG TAC GAG ACC GAG CAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His
85 90 95

CTC CCA GGG TGT TTA CCC TGC GTG AGG GTT GGG AAT CAG TCA CGC TGC 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC ACT CCC ACC GTG GCG GCQ CCT TAC ATC GGC GCT CCG 384
Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

CTT GAA TCC CTC CGG AGT CAT GTG GAT CTG ATG GTA GGT GCC GCT ACT 432
Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GCG TGC TCC GCT CTT TAC ATC GGA GAC CTG TGC GGT GGC GTA TTT TTG 480
Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTC TCT TTC CAG CCG CGG CGC CAC TGG ACT ACG CAG 528
Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC GTT ACG GGC CAC AGG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

<xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His
85 90 95

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met
180 185 190

Ala

(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..579 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

ACG TGC GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTT GGG GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

GAG GAC GGG ATT AAC TAT GCG ACA GGG AAT CTT CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC CTC CTG GCA CTT CTT TCG TGC CTC ACT GTC CCA GCG TCA 192
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCT GAG CAC TAC CGG AAT GCT TCG GGC ATC TAT CAC ATC ACC AAT GAC 240
Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

TGT CCG AAT TCC AGC GTA GTC TAT GAA ACT GAC CAC CAT ATA TTG CAC 288
Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His
85 90 95

TTG CCG GGG TGC GTA CCC TGC GTG AGG GCC GGG AAC GTG TCT CGT TGC 336
Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys
100 105 110

TGG ACG CCG GTA ACA CCT ACG GTG GCT GCC GTA TCC ATG GAC GCT CCG 384
Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro
115 120 125

CTC GAG TCC TTC CGG CGG CAT GTG GAC CTA ATG GTA GGT GCG GCC ACC 432
Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

GTG TGT TCT GTC CTC TAT GTT GGA GAC CTC TGT GGA GGT GCT TTC CTA 480
Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu
145 150 155 160

GTG GGG CAG ATG TTC ACC TTC CAG CCG CGT CGC CAC TGG ACC ACG CAG 528
Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAT TGT AAT TGC TCC ATC TAT ACT GGC CAT ATC ACC GGC CAC AGG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met
180 185 190

GCG 579
Ala



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Val Val Tyr Glu Thr Asp His His Ile Leu His
85 90 95

Leu Pro Gly Cys Val Pro Cys Val Arg Ala Gly Asn Val Ser Arg Cys
100 105 110

Trp Thr Pro Val Thr Pro Thr Val Ala Ala Val Ser Met Asp Ala Pro
115 120 125

Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Val Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met
180 185 190

Ala 



(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..289

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..286

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGC CGC CCC ATG GAC GTT AAG TTC CCG GGC GGT GGC CAG ATC GTT GGT 96
Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCC AGG TTG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

ACT AGG AAG ACT TCG GAG CGG TCG CAA CCT CGT GGG AGA CGT CAG CCT 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCA CGT CGA TCT GAG GGA AGG TCC TGG GCT CAG CCC GGG 240
Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly
65 70 75 80

TAC CCX TGG icT 1 CTT TAC GGT AAT GAG GGT TGT 0GG TGG GC^ GGA TGG G 289 
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95

(2) INFORMATION FOR SEQ ID NO: 192:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95



(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 498 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..498

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..495

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

CGC CGC CCT ATG GAC GTA AAG TTC CCG GGC GGT GGA CAG ATC GTT GGC 96
Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCC CGG TTG GGT GTG CGC GCG 144
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

ACT CGG AAG ACT TCG GAG CGG TCG CAA CCT CGT GGC AGG CGT CAA CCT 192
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCG CGC CGG TCC GAG GGC AGG TCC TGG GCG CAA GCC GGG 240
Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Ala Gly
65 70 75 80

TAC CCC TGG CCC CTC TAT GGC AAT GAG GGC TGT GGG TGG GCA GGG TGG 288
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

CTC CTG TCT CCT CGC GGC TCT CGG CCA TCT TGG GGC CCA AAT GAT CCC 336
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

CGG CGG AGA TCG CGC AAT CTG GGT AAG GTC ATC GAT ACC CTG ACG TGC 384
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTC GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC CCC GTC 432
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val
130 135 140

GGG GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG GCT GTG GAG GAC 480
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp
145 150 155 160

GGG ATT AAC TAT CGA CAG 498
Gly Ile Asn Tyr Arg Gln
165

(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Ala Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp
145 150 155 160

Gly Ile Asn Tyr Arg Gln
165



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..576

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

ACG TGC GGA TTC GCC GAC CTC GTG GGG TAC ATC CCG CTC GTA GGC GGC 48
Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

CCC GTT GGG GGC GTC GCA AGG GCT CTC GCA CAT GGT GTG AGG GTT CTT 96
Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

GAG GAC GGG GTG AAT TAT GCA ACA GGG AAT CTG CCT GGT TGC TCT TTC 144
Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TTC ATT CTT GCA CTT CTC TCG TGC CTC ACT GTC CCG GCC TCT 192
Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

GCA GTT CCC TAC CGA AAT GCC TCT GGG ATC TAT CAT GTC ACC AAT GAT 240
Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCA AAC TCT TCC ATA GTC TAT GAG GCA GAT GAT CTG ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His
85 90 95

GCA CCT GGC TGC GTG CCT TGT GTC AGG AAA GAT AAT GTG AGT AGG TGC 336
Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys
100 105 110

TGG GTC CAA ATT ACC CCC ACG CTG TCA GCC CCG AGC TTC GGA GCA GTC 384
Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val
115 120 125

ACG GCT CCC CTT CGG AGA GCC GTT GAT TAC JTG GTG GGA GGG GOT GCC 432 
Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala 

130 135 140

CTC TGC TCC GCG TTA TAC GTT GGA GAC GCG TGT GGG GCA CTA TTT TTG 480
Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

GTA GGC CAA ATG TTC ACC TAT AGG CCT CGC CAG CAT GCT ACG GTG CAG 528
Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

GAC TGC AAC TGT TCC ATC TAC AGT GGC CAC GTC ACC GGC CAT CAG ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

Thr Cys Gly Phe Ala Asp Leu Val Gly Tyr Ile Pro Leu Val Gly Gly
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu
20 25 30

Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser
50 55 60

Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu His
85 90 95

Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser Arg Cys
100 105 110

Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Phe Gly Ala Val
115 120 125

Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Val Gly Gly Ala Ala
130 135 140

Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu
145 150 155 160

Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Gln Met
180 185 190

Ala 

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 1..576 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

ACT TGC GGC TTT GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC 48
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

CCC GTG GGT GGC GTC GCC AGA GCC CTG GAA CAT GGT GTT AGG GCT GTG 96
Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

GAG GAC GGC ATC AAT TAT GCA ACA GGG AAT CTC CCC GGT TGC TCT TTC 144
Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

TCT ATC TAC CTC TTG GCA CTT CTC TCG TGC CTG ACT GTT CCC ACC TCG 192
Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

GCC ATC CAC TAT CGC AAT GCC TCG GGC GTC TAC CAC GTC ACC AAT GAC 240
Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
65 70 75 80

TGC CCG AAC TCG AGC ATA GTG TAC GAG GCC GAC CAC CAC ATC CTA CAC 288
Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His
85 90 95

CTT CCA GGG TGC TTA CCC TGT GTG AGG GTT GGG AAT CAG TCA CGT TGT 336
Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

TGG GTG GCC CTC TCT CCC ACC GTG GCG GCG CCT TAC ATC GGT GCT CCA 384
Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

GTT GAA TCC TTC CGG AGA CAC GTG GAC ATG ATG GTG GGC GCT GCT ACT 432
Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr
130 135 140

GTG TGC TCC GCT CTC TAT ATT GGG GAC TTG TGT GGT GGC GTA TTC TTG 480
Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

GTT GGT CAG ATG TTT TCT TTC CGG CCA CGA CGC CAC TGG ACT ACG CAG 528
Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

GAC TGC AAT TGT TCC ATC TAC GCG GGG CAC ATC ACT GGC CAC GGA ATG 576
Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met
180 185 190

GCA 579
Ala



(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala
1 5 10 15

Pro Val Gly Gly Val Ala Arg Ala Leu Glu His Gly Val Arg Ala Val
20 25 30

Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe
35 40 45

Ser Ile Tyr Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser
50 55 60

Ala Ile His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
65 70 75 80

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His
85 90 95

Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys
100 105 110

Trp Val Ala Leu Ser Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro
115 120 125

Val Glu Ser Phe Arg Arg His Val Asp Met Met Val Gly Ala Ala Thr
130 135 140

Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu
145 150 155 160

Val Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln
165 170 175

Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Gly Met
180 185 190

Ala



(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1470

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 2..1467



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

A TCA CCA CCG GAG CTT CTA TCA CAT ACT CCA CTT ACG GCA AGT TCC 46
Ser Pro Pro Glu Leu Leu Ser His Thr Pro Leu Thr Ala Ser Ser
1 5 10 15

TTG CTG ATG GAG GGT GTT CAG GCG GCG CGC ATG ACG TGA TCA TAT GCG 94
Leu Leu Met Glu Gly Val Gln Ala Ala Arg Met Thr * Ser Tyr Ala
20 25 30

ACG AGT GCC ATT CCC AGG ACG CCA CCA CCA TTC TTG GGA TAG GCA CTG 142
Thr Ser Ala Ile Pro Arg Thr Pro Pro Pro Phe Leu Gly * Ala Leu
35 40 45

TCC TTG ACC AGG CAG AGA CGG CTG GAG CTA GGC TCG TCG TCT TGG CCA 190
Ser Leu Thr Arg Gln Arg Arg Leu Glu Leu Gly Ser Ser Ser Trp Pro
50 55 60

CGG CCA CCC CTC CCG GCA GTG TGA CAA CGC CCC ACC CCA ACA TCG AGG 238
Arg Pro Pro Leu Pro Ala Val * Gln Arg Pro Thr Pro Thr Ser Arg
65 70 75

AAG TGG CCC TGC CTC AGG AGG GGG AGG TTC CCT TCT ACG GCA GAG CCA 286
Lys Trp Pro Cys Leu Arg Arg Gly Arg Phe Pro Ser Thr Ala Glu Pro
80 85 90 95

TTC CCC TTG CTT TTA TAA AGG GTG GTA GGC ATC TCA TCT TCT GCC ATT 334
Phe Pro Leu Leu Leu * Arg Val Val Gly Ile Ser Ser Ser Ala Ile
100 105 110

CCA AGA AAA AAT GTG ATG AAC TCG CCA AGC AAC TGA CCA GCC TGG GCG 382
Pro Arg Lys Asn Val Met Asn Ser Pro Ser Asn * Pro Ala Trp Ala
115 120 125

TGA ACG CCG TGG CAT ATT ATA GAG GTC TAG ACG TCG CCG TCA TAC CCA 430
* Thr Pro Trp His Ile Ile Glu Val * Thr Ser Pro Ser Tyr Pro
130 135 140

CAA CAG GAG ACG TGG TCG TGT GCA GCA CCG ACG CGC TCA TGA CGG GAT 478
Gln Gln Glu Thr Trp Ser Cys Ala Ala Pro Thr Arg Ser * Arg Asp
145 150 155

TCA CCG GCG ACT TTG ATT CTG TCA TAG ACT GCA ACT CCG CCG TCA CTC 526
Ser Pro Ala Thr Leu Ile Leu Ser * Thr Ala Thr Pro Pro Ser Leu
160 165 170 175

AGA CGG TGG ACT TCA GTC TGG ATC CCA CTT TTA CCA TTG AGA CTA CCA 574
Arg Arg Trp Thr Ser Val Trp Ile Pro Leu Leu Pro Leu Arg Leu Pro
180 185 190

CAG TGC CCC AGG ACG CAG TGT CCA GAA GCC AGC GTT GGG GCC GCA CGG 622
Gln Cys Pro Arg Thr Gln Cys Pro Glu Ala Ser Val Gly Ala Ala Arg
195 200 205

GGA GAG GTA GGC ACG GCA TAT ACC GGT ATG TCT CGG CTG GAG AGA GAC 670
Gly Glu Val Gly Thr Ala Tyr Thr Gly Met Ser Arg Leu Glu Arg Asp
210 215 220

CGT CTG GCA TGT TCG ACT CCG TGG TGC TCT GTG AGT GCT ACG ATG CCG 718
Arg Leu Ala Cys Ser Thr Pro Trp Cys Ser Val Ser Ala Thr Met Pro
225 230 235

GAT GTG CAT GGT ACG ATC TGA CTC CTG CCG AGA CTA CCG TGA GGT TGC 766
Asp Val His Gly Thr Ile * Leu Leu Pro Arg Leu Pro * Gly Cys
240 245 250 255

GCG CTT ACT AAA CAC CCC CGG GCT CCC TGT CTG TCA GGA CCA TTT GGA 814
Ala Leu Thr Lys His Pro Arg Ala Pro Cys Leu Ser Gly Pro Phe Gly
260 265 270

ATT CTG GGA GGG GGT GTT CAC GGG GCT CAC TAA CAT CGA CGC TCA CAT 862
Ile Leu Gly Gly Gly Val His Gly Ala His * His Arg Arg Ser His
275 280 285

GCT GTC ACA GAC CAA ACA GGG TGG GGA GAA TTT CCC ATA CCT TGT AGC 910
Ala Val Thr Asp Gln Thr Gly Trp Gly Glu Phe Pro Ile Pro Cys Ser
290 295 300

GTA CCA AGC AAC AGT CTG TGT TCG CGC GAA AGC GCC CCC CCC CAG CTG 958
Val Pro Ser Asn Ser Leu Cys Ser Arg Glu Ser Ala Pro Pro Gln Leu
305 310 315

GGA CAC AAT GTG GAA ATG CAT GCT CCG TCT CAA ACC GAC TTA ACT GGC 1006
Gly His Asn Val Glu Met His Ala Pro Ser Gln Thr Asp Leu Thr Gly
320 325 330 335

CCT ACT CCC CTC TTG TAC AGG CTG GGG CCC GTC CAG AAT GAG ATC ACA 1054
Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile Thr
340 345 350

CTG ACG CAC CCC ATC ACC AAG TAC ATT ATG GCT TGC ATG TCT GCG GAC 1102
Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp
355 360 365

TTG GAG GTC ATT ACC AGC ACT TGG GTT CTG GTG GGG GGC GTT GTG GCG 1150
Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly Gly Val Val Ala
370 375 380

GCC CTG GCG GCC TAC TGC TTG ACG GTG GGT TCG GTA GCC ATA GTC GGT 1198
Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val Ala Ile Val Gly
385 390 395

AGG ATC ATC CTC TCT GGG AAA CCT GCC ATC ATT CCC GAT AGG GAG GTA 1246
Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val
400 405 410 415

TTA TAC CAG CAA TTT GAT GAG ATG GAG GAG TGC TCG GCC TCG TTG CCC 1294
Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser Ala Ser Leu Pro
420 425 430

TAT ATG GAC GAA ACA CGT GCC ATT GCC GGA CAA TTC AAA GAG AAA GTG 1342
Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe Lys Glu Lys Val
435 440 445

CTC GGC TTC ATC AGC ACG ACC GGC CAG AAG GCT GAA ACT CTG AAG CCG 1390
Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu Thr Leu Lys Pro
450 455 460

GCA GCC ACG TCT GTG TGG AAC AAG GCT GAG CAG TTC TGG CCA CAT ACA 1438
Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe Trp Pro His Thr
465 470 475

TGT GGA ACT TCA TCA GTG GGA TAC AAT AAT AG 1470
Cys Gly Thr Ser Ser Val Gly Tyr Asn Asn
480 485



(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1485 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1485

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

TGTGCCAGGA CCATCACCAC CGGAGCTTCT ATCACATACT CCACTTACGG CAAGTTCCTT 60

GCTGATGGAG GGTGTTCAGG CGGCGCGCAT GACGTGATCA TATGCGACGA GTGCCATTCC 120

CAGGACGCCA CCACCATTCT TGGGATAGGC ACTGTCCTTG ACCAGGCAGA GACGGCTGGA 180

GCTAGGCTCG TCGTCTTGGC CACGGCCACC CCTCCCGGCA GTGTGACAAC GCCCCACCCC 240

AACATCGAGG AAGTGGCCCT GCCTCAGGAG GGGGAGGTTC CCTTCTACGG CAGAGCCATT 300

CCCCTTGCTT TTATAAAGGG TGGTAGGCAT CTCATCTTCT GCCATTCCAA GAAAAAATGT 360

GATGAACTCG CCAAGCAACT GACCAGCCTG GGCGTGAACG CCGTGGCATA TTATAGAGGT 420

CTAGACGTCG CCGTCATACC CACAACAGGA GACGTGGTCG TGTGCAGCAC CGACGCGCTC 480

ATGACGGGAT TCACCGGCGA CTTTGATTCT GTCATAGACT GCAACTCCGC CGTCACTCAG 540

ACGGTGGACT TCAGTCTGGA TCCCACTTTT ACCATTGAGA CTACCACAGT GCCCCAGGAC 600

GCAGTGTCCA GAAGCCAGCG TTGGGGCCGC ACGGGGAGAG GTAGGCACGG CATATACCGG 660

TATGTCTCGG CTGGAGAGAG ACCGTCTGGC ATGTTCGACT CCGTGGTGCT CTGTGAGTGC 720

TACGATGCCG GATGTGCATG GTACGATCTG ACTCCTGCCG AGACTACCGT GAGGTTGCGC 780

GCTTACNTAA ACACCCCCGG GCTCCCTGTC TGTCAGGACC ATTTGGAATT CTGGGAGGGG 840

GTGTTCACGG GGCTCACTAA CATCGACGCT CACATGCTGT CACAGACCAA ACAGGGTGGG 900

GAGAATTTCC CATACCTTGT AGCGTACCAA GCAACAGTCT GTGTTCGCGC GAAAGCGCCC 960

CCCCCCAGCT GGGACACAAT GTGGAAATGC ATGCTCCGTC TCAAACCGAC NTTAACTGGC 1020

CCTACTCCCC TCTTGTACAG GCTGGGGCCC GTCCAGAATG AGATCACACT GACGCACCCC 1080
ATCACCAAGT ACATTATGGC TTGCATGTCT GCGGACTTGG AGGTCATTAC CAGCACTTGG 1140

GTTCTGGTGG GGGGCGTTGT GGCGGCCCTG GCGGCCTACT GCTTGACGGT GGGTTCGGTA 1200

GCCATAGTCG GTAGGATCAT CCTCTCTGGG AAACCTGCCA TCATTCCCGA TAGGGAGGTA 1260

TTATACCAGC AATTTGATGA GATGGAGGAG TGCTCGGCCT CGTTGCCCTA TATGGACGAA 1320

ACACGTGCCA TTGCCGGACA ATTCAAAGAG AAAGTGCTCG GCTTCATCAG CACGACCGGC 1380

CAGAAGGCTG AAACTCTGAA GCCGGCAGCC ACGTCTGTGT GGAACAAGGC TGAGCAGTTC 1440

TGGNCCACAT ACATGTGGAA CTTCATCAGT GGGATACAAT AATAG 1485

(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 484 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Cys Ala Arg Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr
1 5 10 15

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala His Asp Val
20 25 30

Ile Ile Cys Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly
35 40 45

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
50 55 60

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro
65 70 75 80

Asn Ile Glu Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr
85 90 95

Gly Arg Ala Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile
100 105 110

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr
115 120 125

Ser Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala
130 135 140




Val Ile Pro Thr Thr Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu
145 150 155 160

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser
165 170 175

Ala Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
180 185 190

Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Trp
195 200 205

Gly Arg Thr Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala
210 215 220

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Val Val Leu Cys Glu Cys
225 230 235 240

Tyr Asp Ala Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr
245 250 255

Val Arg Leu Arg Ala Tyr Xaa Asn Thr Pro Gly Leu Pro Val Cys Gln
260 265 270

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile
275 280 285

Asp Ala His Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Pro
290 295 300

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro
305 310 315 320

Pro Pro Ser Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro
325 330 335

Xaa Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln
340 345 350

Asn Glu Ile Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys
355 360 365

Met Ser Ala Asp Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly
370 375 380

Gly Val Val Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val
385 390 395 400

Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro
405 410 415

Asp Arg Glu Val Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser
420 425 430

Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe
435 440 445

Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu
450 455 460

Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe
465 470 475 480

Trp Xaa Thr Tyr



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1485 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1485

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TGTGCCAGGA CCATCACCAC CGGAGCTTCT ATCACATACT CCACTTACGG CAAGTTCCTT 60

GCTGATGGAG GGTGTTCAGG CGGCGCGTAT GACGTGATCA TATGCGACGA GTGCCATTCC 120

CAGGACGCCA CCACCATTCT TGGGATAGGC ACTGTCCTTG ACCAGGCAGA GACGGCTGGA 180

GCTAGGCTCG TCGTCTTGGC CACGGCCACC CCTCCCGGCA GTGTGACAAC GCCCCACCCC 240

AACATCGAGG AAGTGGCCCT GCCTCAGGAG GGGGAGGTTC CCTTCTACGG CAGAGCCATT 300

CCCCTTGCTT TTATAAAGGG TGGTAGGCAT CTCATCTTCT GCCATTCCAA GAAAAAATGT 360

GATGAACTCG CCAAGCAACT GACCAGCCTG GGCGTGAACG CCGTGGCATA TTATAGAGGT 420

CTAGACGTCG CCGTCATCCC CACAGCAGGA GACGTGGTCG TGTGCAGCAC CGACGCGCTC 480

ATGACGGGAT TCACCGGCGA CTTTGATTCT GTCATAGACT GCAACTCCGC CGTCACTCAG 540

ACGGTGGACT TCAGTCTGGA TCCCACTTTT ACCATTGAGA CTACCACAGT GCCCCAGGAC 600

GCAGTGTCCA GAAGCCAGCG TAGGGGCCGC ACGGGGAGAG GTAGGCACGG CATATACCGG 660

TATGTCTCGG CTGGAGAGAG ACCNTCTGAC ATGTTCGACT CCGTGGTGCT CTGTGAGTGC 720

TACGATGCCG GATGTGCGTG GTATGATCTG ACTCCTGCCG AGACTACCGT GAGGTTGCGC 780

GCTTACATAA ACACCCCCGG GCTCCCTGTC TGTCAGGACC ATTTGGAATT CTGGGAGGGG 840

GTGTTCACGG GGCTCACTAA CATCGACGCT CACATGCTGT CACAGACCAA ACAGGGTGGG 900

GAGAATTTNC CATACCTTGT AGCGTACCAA GCAACAGTCT GTGTTCGCGC GAAAGCGCCC 960

CCCCCCAGCT GGGACACAAT GTGGAAATGC ATGCTCCGTC TCAAACCGAC TTTAACTGGC 1020

CCTACTCCCC TCTTGTACAG GCTGGGGCCC GTCCAGANTG AGATCACACT GACGCACCCC 1080

ATCACCAAGT ACATTATGGC TTGCATGTCT GCGGACTTGG AGGTCATTAC CANCACTTGG 1140

GTTCTGGTGG GGGGCGTTGT GGCGGCCCTG GCGGCCTACT GCTTGACGGT GGGTTCGGTA 1200

GCCATAGTCG GTAGGATCAT CCTCTCTGGG AAACCTGCCA TCATTCCCGA TAGGGAGGCA 1260

TTATACCAGC AATTTGATGA GATGGAGGAG TGCTCGGCCT CGTTGCCCTA TATGGACGAG 1320

ACACGTGCCA TTGCCGGACA ATTCAAAGAG AAAGTGCTCG GCTTCATCAG CACGACCGGC 1380

CAGAAGGCTG AAACTCTGAA GCCGGCAGCC ACGTCTGTGT GGAACAAGGC TGAGCAGTTC 1440

TGGGCCACAT ACATGTGGAA CTTCATCAGC GGGATACAAT AATAG 1485



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 484 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

Cys Ala Arg Thr lie Thr Thr Gly Ala Ser lie Thr Tyr Ser, Thr Tyr 
1 '5 10 15 

Gly Lys Phe Leu Ala Asp G1 Y G ^Y C Y S Ser Gl Y Gly Ala Tyr Asp Val 
20 25 30 

1 i 
lie lie Cys Asp Glu Cys His Ser Gin Asp Ala Thr Thr lie Leu Gly 

35 40 45 

, He Gly Thr Val Leu Asp Gin Ala Glu Thr Ala Gly Ala Arg Leu Val 
50 55 60 

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro 
65 70 75 80 

Asn He Glu Glu Val Ala Leu Pro Gin Glu Gly Glu Val Pro Phe Tyr 
85 90 95 

Gly Arg Ala He Pro Leu Ala Phe He Lys Gly Gly Arg His Leu He 
100 105 110 

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gin Leu Thr 
115 120 125 
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1 Ser Leu Gly Val Asn Ala' Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala 

130 135 140 

Val He Pro Thr Ala Gly Asp Val Val, Val Cys Ser Thr Asp Ala Leu 
145 - 150 155 160 

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val lie Asp Cys Asn Ser 
165 170 i 175 

\ 

i ' Ala Val Thr Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr He 

' '. 1 ■ ' 180 185 190 

Glu Thr »Thr Thr Val Pro Gin Asp Ala Val Ser Arg Ser Gin Arg Arg 
I 195 200 205 

» 

Gly Arg Thr Gly Arg Gly Arg His Gly He Tyr Arg. Tyr Val Ser Ala 

210 1 2lt 220 

■ . S . i 

Gly Glu Arg Xaa Ser Asp Met Phe Asp Ser Val Val Leu Cys Glu Cys 
225 ■ 230 235 * \ 240 

VTyr Asp Ala Gly Cys Ala Trp Tyr 'Asp Leu Thr Pro Ala Glu Thr Thr 
245 , 250 , 255 

i 

Val Arg Leu Arg Ala Tyr Ile Asn Thr Pro Gly Leu Pro Val Cys Gln 
260 265 270 

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile 
275 280 285 

Asp Ala His Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Xaa Pro 
290 295 300 

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro 
305 310 315 320 

Pro Pro Ser Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro 
325 330 335 

Thr Leu Thr Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln 
340 345 350 

Xaa Glu Ile Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys 
355 360 365 

Met Ser Ala Asp Leu Glu Val Ile Thr Xaa Thr Trp Val Leu Val Gly 
370 375 380 

Gly Val Val Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val 
385 390 395 400 

Ala Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro 
405 410 415 

Asp Arg Glu Ala Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser 
420 425 430 

Ala Ser Leu Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe 
435 440 445 

Lys Glu Lys Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu 
450 455 460 

Thr Leu Lys Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe 
465 470 475 480 

Trp Ala Thr Tyr 

(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..337 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 

C TCC ACT GTG ACT GAG AGA GAC ATC AGG GTC GAA GAA GAA GTC TAT 46 
Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr 
1 5 10 15 

CAG TGT TGT GAT CTG GAG CCC GAG GCC CGC AAG GTA ATA ACC GCC CTC 94 
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu 
20 25 30 

ACG GAG AGA CT<^ TAC GTG GGC GGC CCT ATG TAC AAT AGC AAG GGA GAC 142 
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp 
35 40 45 

CTT TGC GGG TAT CGC AGG TGC CQO GCA AGC GGC GTA TAT ACC ACC AGC 190 
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser 
50 55 60 

TTC GGG AAC ACA CTG ACG TGC TAC CTT AAA GCC TCA GCA GCC ATC AGG 238 
Phe Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg 
65 70 75 

GCT GCG GGG CTG AAG GAC TGC ACC ATG CTG GTT TGC GGT GAC GAC TTA 286 
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTG ATC GCT GAA AGC GGT GGC GTC GAG GAG GAC AAG CGA GCC CTC 334 
Val Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu 
100 105 110 

GGA GCT 340 
Gly Ala 



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Ala 
65 70 75 80 

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Lys Arg Ala Leu Gly 
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..337 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 

C TCC ACA GTG ACT GAA AGA GAC ATC AGG GTC GAG GAA GAG GTC TAC 46 
Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr 
1 5 10 15 

CAG TGT TGT GAC CTG GAG CCT GAA ACC CGC AAG GTA ATA TCT GCC CTC 94 
Gln Cys Cys Asp Leu Glu Pro Glu Thr Arg Lys Val Ile Ser Ala Leu 
20 25 30 

ACT GAA AGA CTC TAT GTG GGC GGT CCC ATG CAC AAC AGC AGG GGA GAC 142 
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp 
35 40 45 

CTA TGC GGG TAC CGT AGA TGC CGC GCG AGC GGC GTA TAC ACC ACA AGC 190 
Leu Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser 
50 55 60 

TTC GGG AAC ACT CTG ACG TGC TTC CTC AAG GCC ACA GCG GCC ACC AAA 238 
Phe Gly Asn Thr Leu Thr Cys Phe Leu Lys Ala Thr Ala Ala Thr Lys 
65 70 75 

GCC GCT GGC CTA AAG GAC TGC ACC ATG TTG GTG TGT GGT GAC GAC TTA 286 
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTT ATC GCC GAA AGC GAT GGT GTC GAA GAG GAC CGC CGA GCC CTC 334 
Val Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Arg Arg Ala Leu 
100 105 110 

GGA GCT 340 
Gly Ala 

(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Glu Pro Glu Thr Arg Lys Val Ile Ser Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Arg Gly Asp Leu 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Leu Thr Cys Phe Leu Lys Ala Thr Ala Ala Thr Lys Ala 
65 70 75 80 

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Arg Arg Ala Leu Gly 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..337 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 

C TCC ACG GTG ACC GAA AGG GAT ATC AGG ACC GAG GAA GAG ATC TAC 46 
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr 
1 5 10 15 

CAG TGC TGC GAC CTG GAG CCC GAA GCC CGC AAG GTG ATA TCC GCC CTA 94 
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu 
20 25 30 

ACG GAA AGA CTC TAC GTG GGC GGT CCC ATG TAC AAC TCC AAG GGG GAC 142 
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp 
35 40 45 

CTA TGC GGG CAA CGG AGG TGC CGC GCA AGC GGG GTC TAC ACC ACC AGC 190 
Leu Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser 
50 55 60 

TTC GGG AAC ACT GTA ACG TGT TAT CTC AAG GCC GTT GCG GCT ACT AGG 238 
Phe Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg 
65 70 75 

GCQ GCA GGT CTG AAA GGT TGC AGC ATG CTG GTT TGT GGA GAC GAC TTA 286 
Ala Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTC ATC TGC GAG AGC GGC GGC GTA GAG GAG GAT GCA AGA GCC CTC 334 
Val Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu 
100 105 110 

CGA GCC 340 
Arg Ala 

(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Glu Ile Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu 
35 40 45 

Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Val Ala Ala Thr Arg Ala 
65 70 75 80 

Ala Gly Leu Lys Gly Cys Ser Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Cys Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu Arg 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..337 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 

C TCC ACG GTG ACT GAA AGG GAC ATT AGG GTC GAG GAA GAG ATC TAC 46 
Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Ile Tyr 
1 5 10 15 

CAG TGC TGT GAC CTG GAG CCC GAG GCA CGC AAG GTG ATA TCC GCT CTC 94 
Gln Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu 
20 25 30 

ACA GAA AGA CTC TAC AAG GGC GGC CCC ATG TAT AAC AGC AAG GGG GAC 142 
Thr Glu Arg Leu Tyr Lys Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp 
35 40 45 

CTA TGC GGG CTT CGG AGG TGC CGC GCA AGC GGG GTA TAC ACC ACA AGC 190 
Leu Cys Gly Leu Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser 
50 55 60 

TTC GGG AAC ACG GTG ACA TGC TAC CTT AAA GCC ACA GCA GCC ACC AGG 238 
Phe Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg 
65 70 75 

GCT GCA GGG CTG AAA GAT TGC ACT ATG CTG GTA TGC GGT GAC GAC TTA 286 
Ala Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTT ATT GCC GAA AGC GGT GGC GTG GAG GAG GAC GCC CGA GCC CTC 334 
Val Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu 
100 105 110 

CGA GCC 340 
Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 208: 



BNSDOCID: <WO 9425601 A2_l_> 



SUBSTITUTE SHEET (RULE 26) 



WO 94/25601 PCT/EP94/01323 

264 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Ile Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Lys Gly Gly Pro Met Tyr Asn Ser Lys Gly Asp Leu 
35 40 45 

Cys Gly Leu Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Ala 
65 70 75 80 

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ala Glu Ser Gly Gly Val Glu Glu Asp Ala Arg Ala Leu Arg 
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..340 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 

CCCCACCGTG ACNGAGAGGG ACNTCAGGGT CGAGGAAGAG GTCTATCAGT GCTGTAATCT 60 
GGAGNCCGAT GNCCGCAAGG TCATCAACGC CCTCACAGAG AGACTCTACG TGGGCGGCCC 120 
TATGCACAAC AGCAAGGGAG ACCTGTGTGG CATCCGTAGA TGCCGCGCGA GCGGCGTTTA 180 
CACCACGAGC TTCGGAAACA CGCTGACTTG CTACCTCAAA GCCACAGCGG CCACCAGGGC 240 
CGCGGGCTTG AAGGATTGCA CCATGCTGGT CTGCGGNGAC GACCTGGTTG TCATTGCTGA 300 
GAGCATTGGC ATAGACGAGG ACAaIgCAAGC CCTQCGNACT 340 

(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 

Pro Thr Val Thr Glu Arg Asp Xaa Arg Val Glu Glu Glu Val Tyr Gln 
1 5 10 15 

Cys Cys Asn Leu Glu Xaa Asp Xaa Arg Lys Val Ile Asn Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Leu 
35 40 45 

Cys Gly Ile Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Ala 
65 70 75 80 

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ala Glu Ser Ile Gly Ile Asp Glu Asp Lys Gln Ala Leu Arg 
100 105 110 

Thr 
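
The Xaa residues in SEQ ID NO: 210 line up with codons of SEQ ID NO: 209 that contain an undetermined base N, when the nucleotide sequence is read in the frame that reproduces the listed residues (codons starting at base 2). The following Python sketch is illustrative only and is not part of the original listing: it assumes the standard genetic code, and the names `CODON_TABLE` and `translate_ambiguous` are hypothetical. A codon containing N is expanded into every base it could stand for, and a residue is reported only when all expansions agree; otherwise the result is X (Xaa).

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_ambiguous(codon):
    """Return the one-letter residue for a codon that may contain N, or 'X' if undetermined."""
    # Each position is either a fixed base or, for N, all four bases.
    options = [BASES if base == "N" else base for base in codon.upper()]
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


# Codons taken from SEQ ID NO: 209 in the frame described above.
for codon in ("ACN", "GGN", "NTC", "NCC", "GNC"):
    print(codon, translate_ambiguous(codon))
```

Under these assumptions ACN (bases 11 to 13) and GGN (bases 275 to 277) still resolve to Thr and Gly, which agrees with residues 4 and 92 of SEQ ID NO: 210, while NTC, NCC and GNC cannot be resolved and correspond to the Xaa entries at residues 8, 22 and 24.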



(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..340 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 

CTCGACTGTG NpCGAGAGGG ACATCAGGAC AGAGGGAGAG GTCTATCAGT GTTGCGACCT 60 
GGAACCGGAA GCCCGCAAGG TAATCACCGC CCTCACTGAG AGACTCTATG TGGGCGGACC 120 
CATGTTCAAC AGCAAGGGAG ACCTGTGCGG ACAACGCCGG TGCCGCGCAA GCGGCGTGTT 180 
CACCACCAGC TTCGGGAACA CACTGACGTG CTACCTTAAA GCCACAGCTG CTACTAGAGC 240 
AGCCGGCTTk AAAGATTGCA CCATGCTGGT CTGQGGTGAC GACTTAGTCG TTATTTCCGA 300 
GAGCGCCGGT GTGGAGGAGG ATCCCAlifAAC CCNNCGACCN 340 



(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 

Ser Thr Val Xaa Glu Arg Asp Ile Arg Thr Glu Gly Glu Val Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Asp Leu 
35 40 45 

Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Phe 
50 55 60 

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Ala 
65 70 75 80 

Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ser Glu Ser Ala Gly Val Glu Glu Asp Pro Xaa Thr Xaa Arg 
100 105 110 

Pro 



(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..337 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213: 

C TCA ACA GTC ACC GAG AAC GAC ATC CGT GTT GAG GAG TCA ATT TAC 46 
Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr 
1 5 10 15 

CAA TGT TGT GAC TTG GCC CCC GAG GCC AGA CAG GCC ATA AAG TCG CTC 94 
Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser Leu 
20 25 30 

ACA GAG CGG CTT TAT ATC GGG GGT CCC CTG ACT AAT TCA AAG GGG CAG 142 
Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln 
35 40 45 

AAC TGT GGC TAT CGC CGA TGC CGC GCA AGC GGC GTG CTG ACG ACC AGC 190 
Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser 
50 55 60 

TGC GGT AAT ACC CTT ACA TGT TAC CTA AAG GCC TCT GCA GCC TGT CGA 238 
Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg 
65 70 75 

GCT GCG AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC GGG GAC GAC CTT 286 
Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTT ATC TGT GAA AGC GCG GGA ACC CAA GAG GAC GCG GCG AGC CTA 334 
Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu 
100 105 110 

CGA GTC 340 
Arg Val 



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln 
1 5 10 15 

Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser Leu Thr 
20 25 30 

Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys 
50 55 60 

Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala 
65 70 75 80 

Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg 
100 105 110 

Val 
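
The relationship between a cDNA entry and its protein entry can be checked mechanically: the CDS of SEQ ID NO: 213 is given as 2..340, that is 339 bases or 113 codons, which matches the 113 residues of SEQ ID NO: 214. The following Python sketch is illustrative only and is not part of the original listing. It assumes the standard genetic code; the sequences are transcribed from the two entries above, and the names `CDNA_213`, `PROTEIN_214` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}

# SEQ ID NO: 213 as transcribed from the listing (whitespace is removed below).
CDNA_213 = "".join("""
C TCA ACA GTC ACC GAG AAC GAC ATC CGT GTT GAG GAG TCA ATT TAC
CAA TGT TGT GAC TTG GCC CCC GAG GCC AGA CAG GCC ATA AAG TCG CTC
ACA GAG CGG CTT TAT ATC GGG GGT CCC CTG ACT AAT TCA AAG GGG CAG
AAC TGT GGC TAT CGC CGA TGC CGC GCA AGC GGC GTG CTG ACG ACC AGC
TGC GGT AAT ACC CTT ACA TGT TAC CTA AAG GCC TCT GCA GCC TGT CGA
GCT GCG AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC GGG GAC GAC CTT
GTC GTT ATC TGT GAA AGC GCG GGA ACC CAA GAG GAC GCG GCG AGC CTA
CGA GTC
""".split())

# SEQ ID NO: 214 as transcribed from the listing, one three-letter code per residue.
PROTEIN_214 = """
Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln
Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser Leu Thr
Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn
Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys
Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala
Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg
Val
""".split()


def translate(cdna, start, end):
    """Translate the 1-based inclusive range start..end of cdna, codon by codon."""
    region = cdna[start - 1:end]
    usable = len(region) - len(region) % 3  # ignore an incomplete trailing codon
    return [ONE_TO_THREE[CODON_TABLE[region[i:i + 3]]] for i in range(0, usable, 3)]


translated = translate(CDNA_213, 2, 340)
print(len(CDNA_213), len(translated), translated == PROTEIN_214)
```

The same helper can be applied to the mat_peptide range 2..337, which covers 112 codons and therefore stops one residue short of the full CDS.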



(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..340 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215: 

C TCA ACC GTC ACG GAG AGG GAT ATA AGA ACA GAA GAA TCC ATA TAT 46 
Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr 
1 5 10 15 

CAA GCT TGT TCC CTG CCC CAA GAG GCC AGA ACT GTC ATA CAC TCG CTC 94 
Gln Ala Cys Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu 
20 25 30 

ACC GAG AGA CT£ TAC GTG GGA GGG CCC ATG ATA AAC AGC AAA GGG CAA 142 
Thr Glu Arg Leu Tyr Val Gly Gly Pro Met Ile Asn Ser Lys Gly Gln 
35 40 45 

TCC TGC GGT TAC AGG CGT TGC CGC GCA AGC GGT GTT TTC ACC ACC AGC 190 
Ser Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser 
50 55 60 

ATG GGG AAT ACC ATG ACG TGT TAC ATC AAA GCC CTT GCA GCG TGT AAA 238 
Met Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys Lys 
65 70 75 

GCC GCA GGG ATC GTG GAC CCC GTC ATG CTG GTG TGT GGA GAC GAC CTG 286 
Ala Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTC ATC TCG GAG AGC CAG GGT AAC GAG GAG GAC GAG CGA AAC CTG 334 
Val Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu Arg Asn Leu 
100 105 110 

AGA GCT 340 
Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 

Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln 
1 5 10 15 

Ala Cys Ser Leu Pro Gln Glu Ala Arg Thr Val Ile His Ser Leu Thr 
20 25 30 

Glu Arg Leu Tyr Val Gly Gly Pro Met Ile Asn Ser Lys Gly Gln Ser 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Met 
50 55 60 

Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys Lys Ala 
65 70 75 80 

Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Ile Ser Glu Ser Gln Gly Asn Glu Glu Asp Glu Arg Asn Leu Arg 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 340 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 2..340 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 2..340 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

C TCG ACT GTC ACT GAA CAG GAC ATC AGG GTG GAA GAG GAG ATA TAT 46 
Ser Thr Val Thr Glu Gln Asp Ile Arg Val Glu Glu Glu Ile Tyr 
1 5 10 15 

CAA TGC TGC AAC CTT GAA CCG GAG GCC AGG AAA GTG ATC TCC TCC CTC 94 
Gln Cys Cys Asn Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ser Leu 
20 25 30 

ACG GAG CGG CTT TAC TGC GGA GGC CCT ATG TTT AAC AGC AAG GGls GCC 142 
Thr Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Ala 
35 40 45 

CAG TGT GGT TAT CGC CGT TGC CGT GCC AGT GGA GTT CTG CCT ACC AGC 190 
Gln Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser 
50 55 60 

TTT GGC AAC ACA ATC ACT TGT TAC ATC AAG GCC ACA ACG GCC GCG AAG 238 
Phe Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Thr Ala Ala Lys 
65 70 75 

GCC GCA GGC CTC CGG AAC CCG GAC TTT CTT GTC TGC GGA GAT GAT CTG 286 
Ala Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp Leu 
80 85 90 95 

GTC GTG GTG GCT GAG AGT GAT GGd GTC GAC GAG GAT AGA GCA GCC CTG 334 
Val Val Val Ala Glu Ser Asp Gly Val Asp Glu Asp Arg Ala Ala Leu 
100 105 110 

AGA GCC 340 
Arg Ala 



(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 113 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 

Ser Thr Val Thr Glu Gln Asp Ile Arg Val Glu Glu Glu Ile Tyr Gln 
1 5 10 15 

Cys Cys Asn Leu Glu Pro Glu Ala Arg Lys Val Ile Ser Ser Leu Thr 
20 25 30 

Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Ala Gln 
35 40 45 

Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Phe 
50 55 60 

Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Thr Thr Ala Ala Lys Ala 
65 70 75 80 

Ala Gly Leu Arg Asn Pro Asp Phe Leu Val Cys Gly Asp Asp Leu Val 
85 90 95 

Val Val Ala Glu Ser Asp Gly Val Asp Glu Asp Arg Ala Ala Leu Arg 
100 105 110 

Ala 

(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 

Arg Ser Glu Gly Arg Thr Ser Trp Ala Gln 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Arg Ser Glu Gly Arg Thr Ser Trp Ala Gln 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

Arg Thr Glu Gly Arg Thr Ser Trp Ala Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 629 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 3..629 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 3..629 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 



TA GAC TTT TGG GAG AGC GTC TTC ACT GGA CTA ACT CAC ATA GAT GCC 47 
Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala 
1 5 10 15 

CAC TTT CTG TCA CAG ACT AAG CAG CAG GGA CTC AAC TTC TCG TTC CTG 95 
His Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu 
20 25 30 

ACT GCC TAC CAA GCC ACT GTG TGC GCT CGC GCG CAG GCT CCT CCC CCA 143 
Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro 
35 40 45 

AGT TGG GAC GAG ATG TGG AAG TGT CTC GTA CGG CTT AAG CCA ACA CTA 191 
Ser Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu 
50 55 60 

CAT GGA CCT ACG CCT CTT CTA TAT CGG TTG GGG CCT GTC CAA AAT GAA 239 
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu 
65 70 75 

ATC TGC TTG ACA CAC CCC ATC ACA AAA TAC ATC ATG GCA TGC ATG TCA 287 
Ile Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser 
80 85 90 95 

GCT GAT CTG GAA GTA ACC ACC AGC ACC TGG GTT TTG CTT GGA GGG GTC 335 
Ala Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu Gly Gly Val 
100 105 110 

CTC GCG GCC CTA GCG GCC TAC TGC TTG TCA GTC GGT TGT GTT GTG ATT 383 
Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys Val Val Ile 
115 120 125 

GTG GGT CAT ATC GAG CTG GGG GGC AAG CCG GCA ATC GTT CCA GAC AAA 431 
Val Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val Pro Asp Lys 
130 135 140 

GAG GTG TTG TAT CAA CAA TAC GAT GAG ATG GAA GAG TGC TCA CAA GGT 479 
Glu Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys Ser Gln Ala 
145 150 155 

GCC CCA TAT ATC GAA CAA GCT CAG GTA ATA GCT CAC CAG TTC AAG GAA 527 
Ala Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln Phe Lys Glu 
160 165 170 175 

AAA GTC CTT GGA TTG CTG CAG CGA GCC ACC CAA CAA CAA GCT GTC ATT 575 
Lys Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln Ala Val Ile 
180 185 190 

GAG CCC ATA GTA ACT ACC AAC TGG CAA AAG CTT GAG GCC TTT TGG CAC 623 
Glu Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala Phe Trp His 
195 200 205 

A*? CAT 629 
Lys His 



(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 209 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 

Asp Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
1 5 10 15 

Phe Leu Ser Gln Thr Lys Gln Gln Gly Leu Asn Phe Ser Phe Leu Thr 
20 25 30 

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
35 40 45 

Trp Asp Glu Met Trp Lys Cys Leu Val Arg Leu Lys Pro Thr Leu His 
50 55 60 

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile 
65 70 75 80 

Cys Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala 
85 90 95 

Asp Leu Glu Val Thr Thr Ser Thr Trp Val Leu Leu Gly Gly Val Leu 
100 105 110 

Ala Ala Leu Ala Ala Tyr Cys Leu Ser Val Gly Cys Val Val Ile Val 
115 120 125 

Gly His Ile Glu Leu Gly Gly Lys Pro Ala Ile Val Pro Asp Lys Glu 
130 135 140 

Val Leu Tyr Gln Gln Tyr Asp Glu Met Glu Glu Cys Ser Gln Ala Ala 
145 150 155 160 

Pro Tyr Ile Glu Gln Ala Gln Val Ile Ala His Gln Phe Lys Glu Lys 
165 170 175 

Val Leu Gly Leu Leu Gln Arg Ala Thr Gln Gln Gln Ala Val Ile Glu 
180 185 190 

Pro Ile Val Thr Thr Asn Trp Gln Lys Leu Glu Ala Phe Trp His Lys 
195 200 205 

His 



(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 
(A) NAME/KEY: Peptide 
(B) LOCATION: 2..12 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Ile His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 

Val Asn Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 

Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 

Val Asn Tyr His Asn Thr Ser Gly Ile Tyr His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 228: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 

Gln His Tyr Arg Asn Ala Ser Gly Ile Tyr His Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 

Gln His Tyr Arg Asn Val Ser Gly Ile Tyr His Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 230: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 

Ile His Tyr Arg Asn Ala Ser Asp Gly Tyr Tyr Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231: 

Leu Gln Val Lys Asn Thr Ser Ser Ser Tyr Met Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232: 

Val Trp Gln Leu Arg Ala Ile Val Leu His Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233: 

Val Tyr Glu Ala Asp Tyr His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 234: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234: 

Val Tyr Glu Thr Asp Asn His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 235: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 

Val Tyr Glu Thr Glu Asn His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 236: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 

Val Phe Glu Thr Val His His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 237: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237: 

Val Phe Glu Thr Glu His His Ile Leu His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

Val Phe Glu Thr Asp His His Ile Met His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

Val Tyr Glu Thr Glu Asn His Ile Leu His Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

Val Tyr Glu Ala Asp Ala Leu Ile Leu His Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

Val Gln Asp Gly Asn Thr Ser Ala Cys Trp Thr Pro Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

Val Arg Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Ile Ala Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 246: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:

Val Lys Thr Gly Asn Gln Ser Arg Cys Trp Ile Ala Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 247: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:

Val Lys Thr Gly Asn Ser Val Arg Cys Trp Ile Pro Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Val Lys Thr Gly Asn Val Ser Arg Cys Trp Ile Ser Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

Val Arg Lys Asp Asn Val Ser Arg Cys Trp Val Gln Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:

Ala Pro Ser Phe Gly Ala Val Thr Ala Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Val Ser Gln Pro Gly Ala Leu Thr Lys Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

Val Lys Tyr Val Gly Ala Thr Thr Ala Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 253:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253:

Ala Pro Tyr Ile Gly Ala Pro Val Glu Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 254: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Ala Gln His Leu Asn Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

Ser Pro Tyr Val Gly Ala Pro Leu Glu Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 256: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 

Ser Pro Tyr Ala Gly Ala Pro Leu Glu Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:

Ala Pro Tyr Leu Gly Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 258:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:

Ala Pro Tyr Leu Gly Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:

Ala Pro Tyr Val Gly Ala Pro Leu Glu Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:

Asn Val Pro Tyr Leu Gly Ala Pro Leu Thr Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 261:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:

Ala Pro His Leu Arg Ala Pro Leu Ser Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

Ala Pro Tyr Leu Gly Ala Pro Leu Thr Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 263: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:

Arg Pro Arg Gln His Ala Thr Val Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:

Ser Pro Gln His His Lys Phe Val Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:

Arg Pro Arg Arg Leu Trp Thr Thr Gln Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 266: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:

Pro Pro Arg Ile His Glu Thr Thr Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:

Thr Ile Ser Tyr Ala Asn Gly Ser Gly Pro Ser Asp Asp Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268:

Ser Arg Arg Gln Pro Ile Pro Arg Ala Arg Arg Thr Glu Gly Arg Ser
1 5 10 15

Trp Ala Gln



(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1443

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 1..1443 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269: 



ACC ATC ACC ACC GGA GCT TCT ATC ACA TAC TCC ACT TAC GGC AAG TTC 48
Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
1 5 10 15

CTT GCT GAT GGA GGG TGT TCA GGC GGC GCG TAT GAC GTG ATC ATA TGC 96
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val Ile Ile Cys
20 25 30

GAC GAG TGC CAT TCC CAG GAC GCC ACC ACC ATT CTT GGG ATA GGC ACT 144
Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
35 40 45

GTC CTT GAC CAG GCA GAG ACG GCT GGA GCT AGG CTC GTC GTC TTG GCC 192
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
50 55 60

ACG GCC ACC CCT CCC GGC AGT GTG ACA ACG CCC CAC CCC AAC ATC GAG 240
Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu
65 70 75 80

GAA GTG GCC CTG CCT CAG GAG GGG GAG GTT CCC TTC TAC GGC AGA GCC 288
Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Gly Arg Ala
85 90 95

ATT CCC CTT GCT TTT ATA AAG GGT GGT AGG CAT CTC ATC TTC TGC CAT 336
Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
100 105 110

TCC AAG AAA AAA TGT GAT GAA CTC GCC AAG CAA CTG ACC AGC CTG GGC 384
Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr Ser Leu Gly
115 120 125

GTG AAC GCC GTG GCA TAT TAT AGA GGT CTA GAC GTC GCC GTC ATC CCC 432
Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala Val Ile Pro
130 135 140

ACA GCA GGA GAC GTG GTC GTG TGC AGC ACC GAC GCG CTC ATG ACG GGA 480
Thr Ala Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu Met Thr Gly
145 150 155 160

TTC ACC GGC GAC TTT GAT TCT GTC ATA GAC TGC AAC TCC GCC GTC ACT 528
Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser Ala Val Thr
165 170 175

CAG ACG GTG GAC TTC AGT CTG GAT CCC ACT TTT ACC ATT GAG ACT ACC 576
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
180 185 190

ACA GTG CCC CAG GAC GCA GTG TCC AGA AGC CAG CGT AGG GGC CGC ACG 624
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
195 200 205

GGG AGA GGT AGG CAC GGC ATA TAC CGG TAT GTC TCG GCT GGA GAG AGA 672
Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala Gly Glu Arg
210 215 220

CCG TCT GAC ATG TTC GAC TCC GTG GTG CTC TGT GAG TGC TAC GAT GCC 720
Pro Ser Asp Met Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala
225 230 235 240

GGA TGT GCG TGG TAT GAT CTG ACT CCT GCC GAG ACT ACC GTG AGG TTG 768
Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
245 250 255

CGC GCT TAC ATA AAC ACC CCC GGG CTC CCT GTC TGT CAG GAC CAT TTG 816
Arg Ala Tyr Ile Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
260 265 270

GAA TTC TGG GAG GGG GTG TTC ACG GGG CTC ACT AAC ATC GAC GCT CAC 864
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile Asp Ala His
275 280 285

ATG CTG TCA CAG ACC AAA CAG GGT GGG GAG AAT TTC CCA TAC CTT GTA 912
Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Pro Tyr Leu Val
290 295 300

GCG TAC CAA GCA ACA GTC TGT GTT CGC GCG AAA GCG CCC CCC CCC AGC 960
Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro Pro Pro Ser
305 310 315 320

TGG GAC ACA ATG TGG AAA TGC ATG CTC CGT CTC AAA CCG ACT TTA ACT 1008
Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro Thr Leu Thr
325 330 335

GGC CCT ACT CCC CTC TTG TAC AGG CTG GGG CCC GTC CAG AAT GAG ATC 1056
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile
340 345 350

ACA CTG ACG CAC CCC ATC ACC AAG TAC ATT ATG GCT TGC ATG TCT GCG 1104
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
355 360 365

GAC TTG GAG GTC ATT ACC AGC ACT TGG GTT CTG GTG GGG GGC GTT GTG 1152
Asp Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly Gly Val Val
370 375 380

GCG GCC CTG GCG GCC TAC TGC TTG ACG GTG GGT TCG GTA GCC ATA GTC 1200
Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val Ala Ile Val
385 390 395 400

GGT AGG ATC ATC CTC TCT GGG AAA CCT GCC ATC ATT CCC GAT AGG GAG 1248
Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
405 410 415

GCA TTA TAC CAG CAA TTT GAT GAG ATG GAG GAG TGC TCG GCC TCG TTG 1296
Ala Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser Ala Ser Leu
420 425 430

CCC TAT ATG GAC GAG ACA CGT GCQ ATT GCC GGA CAA TTC AAA GAG AAA 1344
Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe Lys Glu Lys
435 440 445

GTG CTC GGC TTC ATC AGC ACG ACC GGC CAG AAG GCT GAA ACT CTG AAG 1392
Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu Thr Leu Lys
450 455 460

CCG GCA GCC ACG TCT GTG TGG AAC AAG GCT GAG CAG TTC TGG GCC ACA 1440
Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe Trp Ala Thr
465 470 475 480

TAC 1443
Tyr

(2) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 481 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:

Thr Ile Thr Thr Gly Ala Ser Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
1 5 10 15

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Val Ile Ile Cys
20 25 30

Asp Glu Cys His Ser Gln Asp Ala Thr Thr Ile Leu Gly Ile Gly Thr
35 40 45

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
50 55 60

Thr Ala Thr Pro Pro Gly Ser Val Thr Thr Pro His Pro Asn Ile Glu
65 70 75 80

Glu Val Ala Leu Pro Gln Glu Gly Glu Val Pro Phe Tyr Gly Arg Ala
85 90 95

Ile Pro Leu Ala Phe Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
100 105 110

Ser Lys Lys Lys Cys Asp Glu Leu Ala Lys Gln Leu Thr Ser Leu Gly
115 120 125

Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ala Val Ile Pro
130 135 140

Thr Ala Gly Asp Val Val Val Cys Ser Thr Asp Ala Leu Met Thr Gly
145 150 155 160

Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Ser Ala Val Thr
165 170 175

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
180 185 190

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
195 200 205

Gly Arg Gly Arg His Gly Ile Tyr Arg Tyr Val Ser Ala Gly Glu Arg
210 215 220

Pro Ser Asp Met Phe Asp Ser Val Val Leu Cys Glu Cys Tyr Asp Ala
225 230 235 240

Gly Cys Ala Trp Tyr Asp Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
245 250 255

Arg Ala Tyr Ile Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
260 265 270

Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr Asn Ile Asp Ala His
275 280 285

Met Leu Ser Gln Thr Lys Gln Gly Gly Glu Asn Phe Pro Tyr Leu Val
290 295 300

Ala Tyr Gln Ala Thr Val Cys Val Arg Ala Lys Ala Pro Pro Pro Ser
305 310 315 320

Trp Asp Thr Met Trp Lys Cys Met Leu Arg Leu Lys Pro Thr Leu Thr
325 330 335

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Pro Val Gln Asn Glu Ile
340 345 350

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
355 360 365

Asp Leu Glu Val Ile Thr Ser Thr Trp Val Leu Val Gly Gly Val Val
370 375 380

Ala Ala Leu Ala Ala Tyr Cys Leu Thr Val Gly Ser Val Ala Ile Val
385 390 395 400

Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
405 410 415

Ala Leu Tyr Gln Gln Phe Asp Glu Met Glu Glu Cys Ser Ala Ser Leu
420 425 430

Pro Tyr Met Asp Glu Thr Arg Ala Ile Ala Gly Gln Phe Lys Glu Lys
435 440 445

Val Leu Gly Phe Ile Ser Thr Thr Gly Gln Lys Ala Glu Thr Leu Lys
450 455 460

Pro Ala Ala Thr Ser Val Trp Asn Lys Ala Glu Gln Phe Trp Ala Thr
465 470 475 480

Tyr